Extreme Multi-Label Classification for Information Retrieval (XMLC4IR)

Tutorial at ECIR 2018 (European Conference on Information Retrieval), Grenoble, France, March 26, 2018.
Presenters: Rohit Babbar (Aalto University, Finland) and Krzysztof Dembczyński (Poznań University of Technology, Poland).

• Rohit Babbar
  – Affiliation: Aalto University
  – Previous affiliations: Max Planck Institute for Intelligent Systems, Université Grenoble Alpes
  – Main interests: extreme classification, multi-label classification, multi-class classification, text classification
• Krzysztof Dembczyński
  – Affiliation: Poznań University of Technology
  – Previous affiliations: Marburg University
  – Main interests: extreme classification, machine learning, multi-label classification, label tree algorithms, learning theory

Agenda
1. Extreme classification: applications and challenges
2. Algorithms
   – Label embeddings
   – Smart 1-vs-All approaches
   – Tree-based methods
   – Label filtering / maximum inner product search
3. Live demonstration
Webpage: http://www.cs.put.poznan.pl/kdembczynski/xmlc-tutorial-ecir-2018/

Extreme multi-label classification is the problem of labeling an item with a small set of tags out of an extremely large number of potential tags.

[Figure: Geoff Hinton, Andrew Ng, Yann LeCun, Yoshua Bengio]

Example: the Wikipedia article on Alan Turing carries categories such as:
1912 births; 1954 deaths; 20th-century mathematicians; 20th-century philosophers; Academics of the University of Manchester Institute of Science and Technology; Alumni of King's College, Cambridge; Artificial intelligence researchers; Atheist philosophers; Bayesian statisticians; British cryptographers; British logicians; British long-distance runners; British male athletes; British people of World War II; Computability theorists; Computer designers; English atheists; English computer scientists; English inventors; English logicians; English long-distance runners; English mathematicians; English people of Scottish descent; English philosophers; Former Protestants; Fellows of the Royal Society; Gay men; Government Communications Headquarters people; History of artificial intelligence; Inventors who committed suicide; LGBT scientists; LGBT scientists from the United Kingdom; Male long-distance runners; Mathematicians who committed suicide; Officers of the Order of the British Empire; People associated with Bletchley Park; People educated at Sherborne School; People from Maida Vale; People from Wilmslow; People prosecuted under anti-homosexuality laws; Philosophers of mind; Philosophers who committed suicide; Princeton University alumni, 1930-39; Programmers who committed suicide; People who have received posthumous pardons; Recipients of British royal pardons; Academics of the University of Manchester; Suicides by cyanide poisoning; Suicides in England; Theoretical computer scientists.

Further application examples:
• New question ⇒ assignment/recommendation of users
• Selected item ⇒ recommendation of the top 3 items
• Sequence of words ⇒ recommendation of the next word
• On-line ad ⇒ recommendation of queries to an advertiser; possible bid phrases: Zurich car insurance, Car insurance, Auto insurance, Vehicle insurance, Electric car insurance
• Suggestion of top Twitter Trends

Setting

• Multi-class classification:

  $\mathbf{x} = (x_1, x_2, \ldots, x_d) \in \mathbb{R}^d \;\xrightarrow{\;h(\mathbf{x})\;}\; y \in \{1, \ldots, m\}$

         x_1   x_2   ...   x_d  |  y
    x:   4.0   2.5   ...  -1.5  |  5

• Multi-label classification (a minimal code sketch of this mapping follows below):

  $\mathbf{x} = (x_1, x_2, \ldots, x_d) \in \mathbb{R}^d \;\xrightarrow{\;h(\mathbf{x})\;}\; \mathbf{y} = (y_1, y_2, \ldots, y_m) \in \{0, 1\}^m$

         x_1   x_2   ...   x_d  |  y_1   y_2   ...   y_m
    x:   4.0   2.5   ...  -1.5  |   1     1    ...    0
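As a concrete, hedged illustration of the multi-label mapping h above, the sketch below implements the simplest possible realization: a linear 1-vs-All (binary relevance) scorer that scores every label and returns the top-k indices as the predicted tag set. The dimensions, the random weights, and the function name predict_top_k are illustrative assumptions, not code from the tutorial.

```python
# Minimal sketch of the multi-label setting above, assuming a plain
# linear 1-vs-All (binary relevance) scorer: one weight vector per label.
# All names and dimensions here are illustrative, not from the tutorial.
import numpy as np

d = 3        # number of features (toy value)
m = 6        # number of labels; in extreme classification m >= 10^5
k = 3        # how many labels to return per instance

rng = np.random.default_rng(0)
W = rng.normal(size=(m, d))   # one row of weights per label (would be learned)
b = np.zeros(m)               # one bias per label

def predict_top_k(x, W, b, k):
    """Score every label with h_j(x) = <w_j, x> + b_j and return the
    k highest-scoring label indices (the predicted tag set)."""
    scores = W @ x + b                      # shape (m,)
    top = np.argpartition(-scores, k)[:k]   # indices of the k largest scores
    return top[np.argsort(-scores[top])]    # sorted by decreasing score

x = np.array([4.0, 2.5, -1.5])              # the example instance from the table
print(predict_top_k(x, W, b, k))            # three label indices
```

Scoring all labels this way costs O(m·d) time per instance and O(m·d) memory for W, which is exactly the cost that the algorithms on the agenda (label embeddings, tree-based methods, label filtering / maximum inner product search) are designed to reduce.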
Extreme classification

Extreme classification ⇒ a large number of labels m (≥ 10^5).

• Predictive performance:
  – Performance measures: Hamming loss, prec@k, NDCG@k, Macro-F (see the metric sketch after this list)
  – Learning theory for large m
  – Training and prediction under a limited time and space budget
  – Learning with missing labels and positive-unlabeled learning
  – Long-tail label distributions and zero-shot learning
• Computational complexity:
  – time vs. space
  – #examples vs. #features vs. #labels
  – training vs. validation vs. prediction
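To complement the performance measures named in the list above, here is a minimal sketch of prec@k and NDCG@k for a single test instance, assuming the model outputs a ranked list of label indices and the ground truth is a set of relevant labels. The function names and toy numbers are ours; the formulas follow the common XMLC conventions (binary gains, normalization by the ideal DCG over min(k, |y|) positions).

```python
# Hedged sketch of prec@k and NDCG@k for one instance; names are illustrative.
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k predicted labels that are truly relevant."""
    return sum(1 for lbl in ranked[:k] if lbl in relevant) / k

def ndcg_at_k(ranked, relevant, k):
    """DCG@k with binary gains, normalized by the ideal DCG@k."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, lbl in enumerate(ranked[:k]) if lbl in relevant)
    ideal = sum(1.0 / math.log2(i + 2)
                for i in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0

ranked = [17, 4, 256, 9, 1023]   # top-scored label indices for one instance
relevant = {4, 9, 77}            # true label set for that instance
print(precision_at_k(ranked, relevant, 3))   # 1/3
print(ndcg_at_k(ranked, relevant, 3))        # ~0.296
```

These per-instance scores are then averaged over the test set. Hamming loss, by contrast, compares the full m-dimensional 0/1 vectors, so with m ≥ 10^5 and only a handful of positives per instance it is dominated by the easy negatives, which is part of why ranking-based measures such as prec@k are the usual report in extreme classification.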
