java - Grouping similar items from CSV file column as primary key


I have a large CSV file with data similar to this:

user id       group
abc           group1
def           group2
abc           group3
ghi           group4
xyz           group2
uvw           group5
xyz           group1
abc           group1
def           group2

I need to group these items so that I get the number of times each (user id, group) pair is repeated, such that:

abc   group1 ->2
abc   group3 ->1
def   group2 ->2
ghi   group4 ->1
uvw   group5 ->1
xyz   group2 ->1
xyz   group1 ->1

Is there a clustering algorithm for this?

In this case, something like the following should work if you don't want to store all of the raw data in memory (it uses Guava's Multiset):

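import com.google.common.collect.Multiset;
import com.google.common.collect.TreeMultiset;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class Tester {

    // Counts every "userId-group" pair built from the first two columns of the file.
    public static Multiset<String> getMultisetFromCsv(String csvFileName, String lineDelimiter) throws IOException {
        Multiset<String> mapper = TreeMultiset.create();

        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader = new BufferedReader(new FileReader(csvFileName))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] currLineSplitted = line.split(lineDelimiter);
                mapper.add(currLineSplitted[0] + "-" + currLineSplitted[1]);
            }
        }

        return mapper;
    }

    public static void main(String[] args) throws IOException {
        Multiset<String> set = getMultisetFromCsv("csv", ",");

        // elementSet() returns each distinct key once; count() gives its number of occurrences.
        for (String key : set.elementSet()) {
            System.out.println(key + " : " + set.count(key));
        }
    }
}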

This way you can construct the map easily. After that, for each key you can get the number of items associated with it using the count method.
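If adding Guava is not an option, roughly the same counting can be sketched with only the JDK, using a TreeMap and Map.merge. This is a minimal sketch, not the answer's original code: the class name PairCounter, the helper countPairs, and the choice to skip lines with fewer than two fields are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;

public class PairCounter {

    // Hypothetical helper: reads lines from any Reader and counts each
    // "userId-group" key. Only the counts are kept in memory, not the rows.
    public static Map<String, Integer> countPairs(Reader source, String delimiterRegex) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.trim().split(delimiterRegex);
                if (fields.length < 2) {
                    continue; // assumption: blank or malformed lines are ignored
                }
                // merge() inserts 1 for a new key, otherwise adds 1 to the existing count.
                counts.merge(fields[0] + "-" + fields[1], 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Small in-memory sample standing in for the real file.
        String sample = "abc,group1\ndef,group2\nabc,group3\nabc,group1\n";
        Map<String, Integer> counts = countPairs(new StringReader(sample), ",");
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
    }
}
```

To read a real file, pass a FileReader instead of the StringReader. If the file starts with a header row such as "user id,group", it would be counted like any other line, so you may want to skip the first line before counting.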

