CFP

ICDM MLCS 2020: 2nd IEEE ICDM Workshop on Multilingual Cognitive Services

Sorrento, Italy, November 17-21, 2020

Conference website	https://bit.ly/ICDM-MLCS-2020
Submission link	https://easychair.org/conferences/?conf=icdmmlcs2020
Submission deadline	August 31, 2020

Topics: data mining, machine learning, natural language processing, multilingual speech recognition

Multilingual Cognitive Services (MLCS)

Call for Papers and Participation

ICDM 2020 workshop - November, 2020

Organizing Committee

William H. Hsu

Professor, Computer Science, Kansas State

E-mail: bhsu@ksu.edu

Google Scholar: http://bit.ly/hsu-gs

Kalika Bali

Researcher, Microsoft

Microsoft Research Labs India

E-mail: kalikab@microsoft.com

CS Bibliography: http://bit.ly/kalikab-dblp

Yihong Theis

Graduate Student, Computer Science, Kansas State

E-mail: yihong@ksu.edu

Google Scholar: http://bit.ly/yihong-gs

Technical Description of the Workshop

Many aspects of artificial intelligence, including search, question answering, and Internet of Things automation in home assistants, rely on robust cognitive services such as natural language understanding from speech and text. One of the technical challenges that remains an open research area and is coming to the forefront of this field is that of adapting cognitive services across languages, to serve a worldwide community of multilingual and multicultural users. This workshop will address research problems in cognitive service design and development that center around multilingual translation, speech recognition, and - to a degree - text understanding. These problems include:

Code mixing: mixed languages in speech and text (conversation, queries, commands)
Language recognition: identifying languages in small units of mixed natural language
Accents: identifying and adapting to regional accents and second-language speakers
Dialogue agents: responses; handling language switching in conversational contexts
Standardization/transcription: translating mixed texts and transcripts to one language

Nearly 20% of people in the United States, and 56% in Europe, consider themselves to be multilingual. Self-described bilingual speakers make up 43% of people worldwide and trilingual speakers 13%; only 40% of people across the world are monolingual as of 2018. In recent years, there has been extensive research on cognitive services, language detection and monolingual translation; however, as globalization adds increasing numbers of multilingual users, the topic of multilingual cognitive services is becoming more prominent, with its own technical challenges, methodologies, and user needs. This workshop aims to gather data science and machine learning researchers from many related areas to discuss how to meet these challenges and needs with new data mining approaches.

For example, there are many different brands of home assistants in different countries. However, when they are used by multilingual speakers, failures of natural language recognition by cognitive services can greatly diminish their accessibility and usability, to the point that they become less practical in their primary purpose (speech-based functions) than mobile devices and applications. When multilingual speakers ask for music by their favorite creative artists or search for information on notable people, places, and things, they are often unable to use native personal and place names, or local terms, because these embedded named entities may be treated as foreign phrases by a regionalized cognitive service. The crucial issue is that most cognitive services are regionalized to be intrinsically monolingual, an assumption that is part of the inherent problem for the large and growing body of multilingual users.

Therefore, we seek to bring together researchers from different fields of data mining, including transdisciplinary and interdisciplinary data scientists, to discuss their innovations, views, and visions regarding cutting-edge cognitive services technology.

Active research areas that are related to cognitive services include:

Data mining and computational linguistics in multilingual domains
Multimodal data science, especially video (dialogues, speechreading)
Machine learning using multilingual natural language data, including text/transcripts
Multilingual speech recognition/prediction with deep learning/artificial neural nets
Human-centered computing, including cognitive models and user modeling
Home assistants and other dialogue agents
Machine translation
Human-robot interaction (HRI) and human-computer interaction (HCI)
Usability of interactive services: how to respond to multilingual queries and dialogue
User adaptation and personalization
Understanding emotions in user context: home/work, friends/strangers, online/in person

The emphasis of this workshop shall be on approaches based on the above methodologies.

Intended Audience and Impact

We welcome paper submissions from researchers in all areas of domain adaptation in cognitive services, particularly:

data mining for cognitive user modeling, adaptation, and personalization
higher-level tasks: question answering (QA) and knowledge-base population
speech recognition
machine translation and language recognition
natural language processing

We also hope to attract ICDM participants from industrial R&D with interesting current applications that showcase multilingual aspects of social media.

Workshop Logistics

Paper Presentations, Invited Talks, and Panel Session: Morning and Afternoon

The workshop will be a full-day event featuring morning and afternoon technical sessions. In the spirit of fostering new research and collaboration, care will be taken to reserve sufficient time for discussions and questions. The program committee aims to accept about 5-10 technical papers for full oral presentation.

Following brief welcoming remarks, a 2-3 hour session will consist of approximately half the oral technical presentations. One or more invited talks following the lunch break will be aimed at serving the interests of a variety of intelligent systems researchers and attracting new researchers to topics in multilingual cognitive services: natural language processing tasks for speech and text-based services and what is distinct about multilingual versions of these tasks; code mixing and code switching; language detection; translation; question answering; and responses and conversation.

The second session will include the second half of the technical papers, concluding with a brief open discussion about possible special issues of journals on the topic. This session may include a panel discussion on the future, cultural significance, and impact of multilingual cognitive services, from serving the diverse needs of global economies, to privacy and safety concerns, to extant and emerging biases and the need for fairness, accountability, and transparency. The goal of these afternoon sessions is to provide additional opportunities for cross-fertilization between academic and industrial research, through introduction of applications and methodologies that may otherwise be unfamiliar to participants in diverse areas.

Poster Session(s) and Proposed Data Challenge

A poster intersession and a post-session are planned for posters, which will be displayed throughout the workshop. Prospective participants may submit short papers on work in progress or draft posters.

Finally, the organizing committee has planned a data challenge consisting of a multilingual speech data set (and possibly a text corpus) to be distributed around May. This data set will be developed by the organizing and program committees working with our industry partners and any sponsors. The challenge will be open for at least three months to allow for maximum participation and will close in late August to mid-September.

Important Dates

Full Papers Due	Monday, August 24, 2020 (extended to August 31, 2020)
Short Papers and Posters Due	Monday, August 24, 2020 (extended to August 31, 2020)
Acceptance Notification	September 17, 2020
Camera-ready copy due	September 24, 2020
Workshop	(Proposed) Thursday, November 17, 2020

Submission

Please submit your paper through the ICDM submission site: https://bit.ly/MLCS2020

Call for Papers and Submission Categories

We encourage submissions containing original theoretical and applied concepts in artificial intelligence with applications to multilingual cognitive services, speech recognition and machine translation. Experimental results are also encouraged, especially on fielded applications, even if they are only preliminary. We therefore invite two categories of paper submissions:

research papers
- should not exceed 12 pages including title page
- due: Monday, August 24, 2020 (main workshop due date)

short summaries (including position papers and poster papers)
- should not exceed 4 pages
- due: Monday, August 24, 2020

DUAL SUBMISSION POLICY: Submission of short (2-4 page) synopses of articles currently in preparation, under review, or accepted for publication as journal articles or book chapters is permitted. Submission of full-length (6-12 page) papers currently under review for other conferences and workshops is also permitted. However, these papers shall be published in the working notes for this workshop if and only if they are compliant with the dual submission guidelines of the other conference or workshop.

SUBMISSION SITE, FORMAT, AND DOUBLE-BLIND POLICY: Papers should be submitted in PDF form using EasyChair. We request that authors prepare papers in the IEEE 2-column format used by the ICDM main conference (template website). The first page of submitted papers should include the title and a brief abstract. Author names, affiliations, postal addresses, electronic mail addresses, and telephone and fax numbers should be omitted for double-blind peer review.

Organizing Committee

William Hsu, Professor, Kansas State University

Kalika Bali, Researcher, Microsoft

Yihong Theis, Graduate Student, Kansas State University

Program Committee

William Hsu, Professor, Kansas State University

Yihong Theis, Graduate Researcher, Kansas State University

Kalika Bali, Principal Researcher (Director of Multilingual Services), Microsoft Research India

Subhash Chandra, Assistant Professor, University of Delhi

Monojit Choudhury, Principal Researcher, Microsoft Research India

Emre Yılmaz, Researcher, National University of Singapore, Singapore

Haizhou Li, Professor, Department of Electrical & Computer Engineering, National University of Singapore

Tien-Ping Tan, Senior Lecturer, Universiti Sains Malaysia

Confirmed Invited Speakers (in addition to above)

Sunayana Sitaram, Researcher, Microsoft

Subhash Chandra, Assistant Professor, University of Delhi

(1-2 additional invited speakers and/or a discussion panel planned)

Past Workshops

Past Workshops by the Organizers

Multilingual Cognitive Services - ICDM 2019, Beijing, China (6 papers, 12 participants)

William Hsu, organizing co-chair of this proposed workshop, has organized a total of 15 workshops, all on data mining topics, at the following venues: IJCAI 2001, 2003, 2011 (2), 2013, 2015, 2016, 2017 (2), 2018, and 2019; KDD 2002 (jointly with AAAI and UAI); SocInfo 2016; ICDM 2019; and AAAI 2020.

Relevant Past Workshops

Dialog System Technology Challenge - AAAI 2019

Reasoning and Learning for Human-Machine Dialogues - AAAI 2019

The 2018 International Workshop on Data-Driven Granular Cognitive Computing - ICDM 2018

Linguistic and Cognitive Approaches To Dialog Agents - IJCAI 2018

First International Workshop on Socio-Cognitive Systems - IJCAI 2018

The 1st International Workshop on Software Engineering for Cognitive Services - ICSE 2018

Workshop on Cognitive Knowledge Acquisition and Applications - IJCAI 2017

Cognition and Artificial Intelligence for Human-Centered Design - IJCAI 2017

Human Machine Collaborative Learning - AAAI 2017