Data mining - How can schema.org help in NLP?
I am working on NLP, collecting interest-based data from web pages.
I came across http://schema.org/ as a source that is said to be helpful for NLP work.
I went through the documentation, and I can see that it adds additional tag properties to identify the content of HTML tags.
It may help a search engine get specific data for a user query.
It says: "Schema.org provides a collection of shared vocabularies webmasters can use to mark up their pages in ways that can be understood by the major search engines: Google, Microsoft, Yandex and Yahoo!"
But I don't understand how it can help me as an NLP guy. Generally I parse the web page content, process it, and extract data from it. Schema.org markup may be there, but I don't know how to utilize it.
Any example or guidance would be appreciated.
Schema.org uses the microdata format for representation. People use microdata for text analytics and for extracting curated content. There can be numerous applications.
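To make the idea concrete, here is a minimal sketch of pulling microdata properties out of a page with nothing but Python's standard `html.parser`. The names `MicrodataExtractor` and `extract_microdata` and the sample HTML are made up for illustration; only the `itemprop` and `content` attributes come from the microdata format itself. It ignores nested items and item types, so treat it as a starting point rather than a complete microdata parser.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; their value lives in an attribute.
VOID_TAGS = {"meta", "link", "img", "br", "hr", "input"}


class MicrodataExtractor(HTMLParser):
    """Hypothetical helper: collects (itemprop, value) pairs from HTML."""

    def __init__(self):
        super().__init__()
        self.properties = []   # (property name, value) in document order
        self._open = []        # stack of (tag, itemprop or None, text chunks)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if tag in VOID_TAGS:
            if prop:
                value = attrs.get("content") or attrs.get("src") or attrs.get("href") or ""
                self.properties.append((prop, value))
            return
        self._open.append((tag, prop, []))

    def handle_data(self, data):
        # Text belongs to every open element that carries an itemprop.
        for _, prop, chunks in self._open:
            if prop:
                chunks.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not any(t == tag for t, _, _ in self._open):
            return  # void tag or stray closing tag
        while self._open:
            open_tag, prop, chunks = self._open.pop()
            if prop:
                # Collapse runs of whitespace into single spaces.
                self.properties.append((prop, " ".join("".join(chunks).split())))
            if open_tag == tag:
                break


def extract_microdata(html):
    parser = MicrodataExtractor()
    parser.feed(html)
    parser.close()
    return parser.properties


# Made-up sample page fragment using schema.org Review markup.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Review">
  <span itemprop="name">Great phone</span>
  <meta itemprop="datePublished" content="2014-01-15">
  <p itemprop="reviewBody">Battery lasts two days and the screen is sharp.</p>
</div>
"""

for name, value in extract_microdata(SAMPLE):
    print(name, "=", value)
```

The point for NLP is that the markup already tells you which span is the review body, the title, or the date, so you can feed clean fields to your pipeline instead of guessing them from the raw page.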
Suppose you want to create a news summarization system. You can use the hNews microformat to extract the relevant content and perform summarization on it.
Suppose you have a review-based search engine and want to list products with positive reviews. You can use the hReview microformat to extract the reviews, then perform sentiment analysis on them to identify whether a product has a negative or positive review.
If you want to create a skill-based resume classifier, extract the content with the hResume microformat. It can give you various details such as contact (uses the hCard microformat), experience, achievements, related work, education, skills/qualifications, affiliations, publications, performance/skills for performance, etc. You can then run a classifier on it to classify CVs by particular skill sets.
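As a rough sketch of the review case: microformats such as hReview mark fields with class names, so the review text can be collected from elements with `class="description"` and then scored. Everything below is illustrative: `HReviewExtractor`, `polarity`, the word lists and the sample HTML are invented for this example, and the tiny word-count scorer is only a stand-in for a real sentiment model. It also assumes well-formed HTML where every non-void tag is closed.

```python
import re
from html.parser import HTMLParser

VOID_TAGS = {"meta", "link", "img", "br", "hr", "input"}

# Toy lexicons, assumed for illustration only.
POSITIVE = {"great", "good", "excellent", "love", "sharp"}
NEGATIVE = {"bad", "poor", "terrible", "broken", "slow"}


class HReviewExtractor(HTMLParser):
    """Hypothetical helper: collects the text of class="description" elements."""

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._depth = 0     # nesting depth inside the current description
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1
        elif "description" in classes:
            self._depth = 1
            self._chunks = []

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.reviews.append(" ".join("".join(self._chunks).split()))


def extract_reviews(html):
    parser = HReviewExtractor()
    parser.feed(html)
    parser.close()
    return parser.reviews


def polarity(text):
    """Counts lexicon hits; a placeholder for a proper sentiment classifier."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


# Made-up sample page with two hReview blocks.
SAMPLE = """
<div class="hreview">
  <span class="item"><span class="fn">Phone A</span></span>
  <div class="description">Great battery and an excellent screen.</div>
</div>
<div class="hreview">
  <span class="item"><span class="fn">Phone B</span></span>
  <div class="description">Terrible camera, slow and often broken.</div>
</div>
"""

for review in extract_reviews(SAMPLE):
    print(polarity(review), "-", review)
```

The same pattern would apply to hNews or hResume: swap the class name you collect and replace `polarity` with a summarizer or a classifier.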
Though schema.org does not help NLP guys directly, it provides a platform to perform text processing in a better way.
Check out http://en.wikipedia.org/wiki/microformat#specific_microformats to see the various microformats; the same page gives more details.