schema problems

CREATE TABLE hostnames (
hostname varchar(255),
ip int unsigned,
PRIMARY KEY (hostname)

-- Table of flows tables
CREATE TABLE flow_tables (
table_name varchar(64), -- name of the table
src_ip int unsigned, -- IP source address
slice varchar(32), -- slice that sent these flows
start_time datetime, -- start time of the earliest flow in the table
end_time datetime, -- end time of the latest flow in the table
flows int unsigned, -- total number of flows in this table
packets bigint unsigned, -- total number of packets sent in all flows in this table

June 26, 2007

The database is horribly inefficient.  Looking into a new schema that will use the same data inputs and actually use relations within the database to link the data.  Currently, a table is created for each flow and then an entry is made into a main table which keeps track of all the flow tables.  In order to query this data, you have to run 2 queries: 1 to the main TOC table to find the name of the flow table, then another to query that table and get the flow data.  This is a perfect example of how NOT to use a relational database.  Looking into the most effecient storage model and nomalization for this system.  2NF should be fine due to the nature of the data.

PlanetFlow Update


Works in progress:

  • DB normalization/optimization
  • GUI update
  • Usability analysis
  • Kernal update
  • New data storage and retrieval
Syndicate content