Insert Data into MySQL in multiple Tables in C# efficiently


I need to insert a huge CSV file into 2 tables with a 1:n relationship within a MySQL database.

The CSV file arrives weekly, is 1 GB in size, and needs to be appended to the existing data. Each of the 2 tables has an auto-increment primary key.

I've tried:

  • Entity Framework (takes the most time of all approaches)
  • DataSets (same)
  • Bulk upload (doesn't support multiple tables)
  • MySqlCommand with parameters (needs to be nested, my current approach)
  • MySqlCommand with a stored procedure, including a transaction

Any further suggestions?

Let's say this is my simplified data structure:

    public class User
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public List<string> Codes { get; set; }
    }

I need to insert the data from the CSV into this database:

           user        (1-n)       code
    +---+-----+-----+        +---+---+-----+
    |pid|fname|lname|        |cid|pid|code |
    +---+-----+-----+        +---+---+-----+
    | 1 |jon  | foo |        | 1 | 1 | ed3 |
    | 2 |max  | foo |        | 2 | 1 | wst |
    | 3 |paul | foo |        | 3 | 2 | xsd |
    +---+-----+-----+        +---+---+-----+

Here is a sample line of the CSV file:

jon;foo;ed3,wst 
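To make the line format concrete, here is a minimal C# sketch that turns one such line into the simplified data structure. It assumes fields are separated by `;`, codes by `,`, and that no field contains quoting or escaped separators; the `CsvLineParser` name is hypothetical and not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class User
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public List<string> Codes { get; set; }
}

public static class CsvLineParser
{
    // Parses one line of the form "firstname;lastname;code1,code2,...".
    // Assumption: no quoted fields and no escaped ';' or ',' characters.
    public static User Parse(string line)
    {
        string[] fields = line.Trim().Split(';');
        if (fields.Length != 3)
            throw new FormatException("Expected 3 ';'-separated fields: " + line);

        return new User
        {
            FirstName = fields[0],
            LastName = fields[1],
            // Split the third field into individual codes, dropping empty entries.
            Codes = fields[2]
                .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(code => code.Trim())
                .ToList()
        };
    }
}
```

For the sample line, `CsvLineParser.Parse` gives a user with first name `jon`, last name `foo`, and the two codes `ed3` and `wst`.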

A bulk load via LOAD DATA LOCAL INFILE is not possible because I have restricted write permissions.

Given the great size of the data, the best approach (performance-wise) is to leave as much of the data processing as possible to the database and not to the application.

Create a temporary table in which the data from the .csv file will be temporarily saved.

    CREATE TABLE `imported` (
        `id` int(11) NOT NULL,
        `firstname` varchar(45) DEFAULT NULL,
        `lastname` varchar(45) DEFAULT NULL,
        `codes` varchar(450) DEFAULT NULL,
        PRIMARY KEY (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Loading the data from the .csv file into this table is pretty straightforward. I would suggest using MySqlCommand (which is also your current approach). Also, using the same MySqlConnection object for all INSERT statements will reduce the total execution time.
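One way to cut round trips further while loading the staging table is to send several rows per INSERT statement. The sketch below only builds the statement text and its parameter values; it does not talk to the database. `ImportBatchBuilder` and the parameter naming scheme are hypothetical, and the ids are assigned by the caller on the assumption that `imported`.`id` has no AUTO_INCREMENT.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ImportBatchBuilder
{
    // Builds one multi-row INSERT for the `imported` staging table.
    // Ids are numbered from startId because the caller must supply them.
    public static (string Sql, Dictionary<string, object> Parameters) Build(
        IReadOnlyList<(string FirstName, string LastName, string Codes)> rows,
        int startId)
    {
        if (rows.Count == 0)
            throw new ArgumentException("rows must not be empty", nameof(rows));

        var sql = new StringBuilder(
            "INSERT INTO imported (id, firstname, lastname, codes) VALUES ");
        var parameters = new Dictionary<string, object>();

        for (int i = 0; i < rows.Count; i++)
        {
            if (i > 0) sql.Append(", ");
            // One placeholder group per row; values travel as parameters,
            // never as concatenated text.
            sql.Append($"(@id{i}, @fn{i}, @ln{i}, @codes{i})");
            parameters[$"@id{i}"] = startId + i;
            parameters[$"@fn{i}"] = rows[i].FirstName;
            parameters[$"@ln{i}"] = rows[i].LastName;
            parameters[$"@codes{i}"] = rows[i].Codes;
        }

        return (sql.ToString(), parameters);
    }
}
```

The resulting text would be assigned to a MySqlCommand's `CommandText`, and each entry added with `Parameters.AddWithValue`, on the single shared MySqlConnection. The batch size is a tuning choice; it has to stay below the server's `max_allowed_packet` limit.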

Then, to process the data further, you can create a stored procedure that will handle it.

Assuming these 2 tables (taken from your simplified example):

    CREATE TABLE `users` (
        `pid` int(11) NOT NULL AUTO_INCREMENT,
        `fname` varchar(45) DEFAULT NULL,
        `lname` varchar(45) DEFAULT NULL,
        PRIMARY KEY (`pid`)
    ) ENGINE=InnoDB AUTO_INCREMENT=3737 DEFAULT CHARSET=utf8;

and

    CREATE TABLE `codes` (
        `cid` int(11) NOT NULL AUTO_INCREMENT,
        `pid` int(11) DEFAULT NULL,
        `code` varchar(45) DEFAULT NULL,
        PRIMARY KEY (`cid`)
    ) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=utf8;

You can have the following stored procedure.

    CREATE DEFINER=`root`@`localhost` PROCEDURE `import_data`()
    BEGIN
        DECLARE fname VARCHAR(255);
        DECLARE lname VARCHAR(255);
        DECLARE codesstr VARCHAR(255);
        DECLARE splitted_value VARCHAR(255);
        DECLARE done INT DEFAULT 0;
        DECLARE newid INT DEFAULT 0;
        DECLARE occurance INT DEFAULT 0;
        DECLARE i INT DEFAULT 0;

        DECLARE cur CURSOR FOR SELECT firstname, lastname, codes FROM imported;
        DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

        OPEN cur;

        import_loop:
            LOOP FETCH cur INTO fname, lname, codesstr;
                IF done = 1 THEN
                    LEAVE import_loop;
                END IF;

                -- Insert the user and remember its generated primary key.
                INSERT INTO users (fname, lname) VALUES (fname, lname);
                SET newid = LAST_INSERT_ID();

                -- Number of codes = number of commas + 1.
                SET i = 1;
                SET occurance = (SELECT LENGTH(codesstr) - LENGTH(REPLACE(codesstr, ',', '')) + 1);

                -- Extract the i-th code and insert it for the new user.
                WHILE i <= occurance DO
                    SET splitted_value =
                        (SELECT REPLACE(SUBSTRING(SUBSTRING_INDEX(codesstr, ',', i),
                        LENGTH(SUBSTRING_INDEX(codesstr, ',', i - 1)) + 1), ',', ''));

                    INSERT INTO codes (pid, code) VALUES (newid, splitted_value);
                    SET i = i + 1;
                END WHILE;
            END LOOP;
        CLOSE cur;
    END

For every row in the source data, it makes an INSERT statement for the users table. Then there is a WHILE loop that splits the comma-separated codes and makes an INSERT statement into the codes table for each one.
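To trace what the SUBSTRING_INDEX-based split computes on each pass of the WHILE loop, here is a C# mirror of the same arithmetic. It is only an illustration of the SQL expression; in C# itself `string.Split` would be the natural choice. `SplitDemo` and its `SubstringIndex` helper are hypothetical names, and the helper only emulates the MySQL function for non-negative counts.

```csharp
using System;
using System.Collections.Generic;

public static class SplitDemo
{
    // Emulates MySQL SUBSTRING_INDEX(s, delim, count) for count >= 0:
    // the text before the count-th delimiter, or the whole string
    // when it contains fewer than count delimiters.
    public static string SubstringIndex(string s, char delim, int count)
    {
        if (count <= 0) return "";
        int pos = -1;
        for (int n = 0; n < count; n++)
        {
            pos = s.IndexOf(delim, pos + 1);
            if (pos < 0) return s;
        }
        return s.Substring(0, pos);
    }

    // Mirrors the loop in the stored procedure.
    public static List<string> Split(string codes)
    {
        var result = new List<string>();
        // Number of items = number of commas + 1.
        int occurrences = codes.Length - codes.Replace(",", "").Length + 1;

        for (int i = 1; i <= occurrences; i++)
        {
            string upToItem = SubstringIndex(codes, ',', i);
            int prefixLength = SubstringIndex(codes, ',', i - 1).Length;
            // MySQL SUBSTRING is 1-based, so position prefixLength + 1 there
            // is the 0-based index prefixLength here. What remains is the
            // i-th item with a leading comma, which Replace strips.
            result.Add(upToItem.Substring(prefixLength).Replace(",", ""));
        }
        return result;
    }
}
```

For `"ed3,wst"` the first pass yields `ed3` and the second pass yields `,wst` before the comma is removed, giving `wst`.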

Regarding the use of LAST_INSERT_ID(): it is reliable on a per-connection basis (see the documentation). If the MySQL connection used to run this stored procedure is not used by other transactions, the use of LAST_INSERT_ID() is safe.

The ID that was generated is maintained in the server on a per-connection basis. This means that the value returned by the function to a given client is the first AUTO_INCREMENT value generated for the most recent statement affecting an AUTO_INCREMENT column by that client. This value cannot be affected by other clients, even if they generate AUTO_INCREMENT values of their own. This behavior ensures that each client can retrieve its own ID without concern for the activity of other clients, and without the need for locks or transactions.

Edit: Here is the OP's variant that omits the temp table imported. Instead of inserting the data from the .csv file into the imported table, you call the SP directly to store each row in the database.

    CREATE DEFINER=`root`@`localhost` PROCEDURE `import_data`(
        IN fname VARCHAR(255), IN lname VARCHAR(255), IN codesstr VARCHAR(255))
    BEGIN
        DECLARE splitted_value VARCHAR(255);
        DECLARE done INT DEFAULT 0;
        DECLARE newid INT DEFAULT 0;
        DECLARE occurance INT DEFAULT 0;
        DECLARE i INT DEFAULT 0;

        -- Insert the user and remember its generated primary key.
        INSERT INTO users (fname, lname) VALUES (fname, lname);
        SET newid = LAST_INSERT_ID();

        -- Number of codes = number of commas + 1.
        SET i = 1;
        SET occurance = (SELECT LENGTH(codesstr) - LENGTH(REPLACE(codesstr, ',', '')) + 1);

        -- Extract the i-th code and insert it for the new user.
        WHILE i <= occurance DO
            SET splitted_value =
                (SELECT REPLACE(SUBSTRING(SUBSTRING_INDEX(codesstr, ',', i),
                LENGTH(SUBSTRING_INDEX(codesstr, ',', i - 1)) + 1), ',', ''));

            INSERT INTO codes (pid, code) VALUES (newid, splitted_value);
            SET i = i + 1;
        END WHILE;
    END
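On the C# side, calling the parameterised procedure once per CSV row could look roughly like the sketch below. It is written against the provider-neutral `System.Data` interfaces so it stands alone; with MySQL the connection would be an open MySqlConnection. `ImportDataCaller` is a hypothetical name, and the `@`-prefixed parameter names are an assumption about how the provider maps them to the procedure's `fname`, `lname` and `codesstr` arguments.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ImportDataCaller
{
    // Joins the codes back into the comma-separated string import_data expects.
    public static string ToCodesArgument(IEnumerable<string> codes)
        => string.Join(",", codes);

    // Calls import_data once per row on an already-open connection,
    // reusing one command and wrapping all calls in one transaction.
    public static void ImportAll(
        IDbConnection connection,
        IEnumerable<(string FirstName, string LastName, IEnumerable<string> Codes)> rows)
    {
        using (IDbTransaction tx = connection.BeginTransaction())
        using (IDbCommand cmd = connection.CreateCommand())
        {
            cmd.Transaction = tx;
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.CommandText = "import_data";

            // Create the parameters once; only their values change per row.
            IDbDataParameter fname = AddParameter(cmd, "@fname");
            IDbDataParameter lname = AddParameter(cmd, "@lname");
            IDbDataParameter codes = AddParameter(cmd, "@codesstr");

            foreach (var row in rows)
            {
                fname.Value = row.FirstName;
                lname.Value = row.LastName;
                codes.Value = ToCodesArgument(row.Codes);
                cmd.ExecuteNonQuery();
            }

            tx.Commit();
        }
    }

    private static IDbDataParameter AddParameter(IDbCommand cmd, string name)
    {
        IDbDataParameter p = cmd.CreateParameter();
        p.ParameterName = name;
        p.DbType = DbType.String;
        cmd.Parameters.Add(p);
        return p;
    }
}
```

Committing once at the end avoids a commit per row. Because LAST_INSERT_ID() is tracked per connection, the connection should not be shared with other work while the import runs. Note that the procedure declares `codesstr` as VARCHAR(255), so longer code lists would not fit unless that size is raised.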

Note: The code that splits the codes was taken from elsewhere (MySQL does not provide a built-in split function for strings).

