About the Course

From social media to news articles to machine logs, text data is everywhere. This class will teach you about Information Extraction – how to extract structured data from text in order to derive valuable insights.

You will learn about information extraction applications in various domains, such as social media, healthcare analytics, and financial risk analysis. You will also explore common text analytics tasks, including entity, relation, and event extraction, as well as sentiment analysis.

Finally, you will dive into "Declarative Information Extraction," a powerful method for doing high-performance and high-quality text analytics, and gain hands-on experience writing your own extractors.

Course Syllabus

Module 1 - Getting to Know Information Extraction

Module 2 - Limitations in Information Extraction

Module 3 - Getting to Know SystemT

Module 4 - Information Extraction with AQL

Module 5 - AQL Basics

Module 6 - Advanced AQL

Module 7 - Declarative Information Extraction and the SystemT Optimizer

Module 8 - Best Practices

General Information

Self-paced
Flexible enrolment
Audit multiple times

Recommended Existing Skills

None

Requirements

None

Course Staff

Yunyao Li

Yunyao Li is a Principal Research Staff Member and Senior Research Manager at IBM Almaden Research Center, where she manages the Scalable Knowledge Intelligence department. She is also a Master Inventor and a member of the IBM Academy of Technology. Her expertise lies in the interdisciplinary areas of natural language processing, databases, human-computer interaction, and information retrieval. She is a founding member of SystemT, a state-of-the-art NLP system currently powering multiple IBM products and numerous projects. She received her PhD and Master's degrees from the University of Michigan, Ann Arbor, and undergraduate degrees from Tsinghua University, Beijing, China. Her path led from small-town China to Silicon Valley. Follow her on Twitter @yunyao_li.

Laura Chiticariu

Laura Chiticariu is the Chief Architect of Watson Knowledge and Language Foundation, with technical leadership responsibilities over Watson Natural Language Understanding, Watson Knowledge Studio, and Watson Knowledge Graph. Laura is a core member of the SystemT R&D team and strongly believes in the notion of "Transparent NLP": leveraging machine learning techniques while ensuring that the NLP system remains easy to comprehend, debug, and enhance. She holds a Ph.D. in Computer Science and has taught NLP at universities within and outside the U.S.

Marina Danilevsky

Marina Danilevsky is a Research Staff Member in the Scalable Knowledge Intelligence group at IBM Almaden Research Center and a core member of the SystemT R&D team. She works at the intersection of data analytics, text mining, natural language processing, information networks, and human-computer interaction. She holds a Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign and a B.S. in Mathematics from the University of Chicago.

Huaiyu Zhu

Huaiyu Zhu is a Research Staff Member in the Scalable Knowledge Intelligence group at IBM Almaden Research Center. His research focuses mainly on text analytics, natural language processing, machine learning, and statistical information processing.

Atsushi Ono

Atsushi Ono is a software engineer at Tokyo Software & Systems Development Lab (TSDL), IBM Japan. After several years of experience on business partner technical enablement missions, he has been working as a front-end developer on various projects, including contributing to the open source Dojo Mobile project. He has worked on the development of IBM Watson Knowledge Studio since the project’s inception.

Yuka Nomura

Yuka Nomura is a software engineer working on front-end development of IBM Watson Knowledge Studio at Tokyo Software & Systems Development Lab (TSDL), IBM Japan. She has contributed to user interface design and product development from her very first project start-up. She also specializes in robot application programming that runs on communication robots such as Pepper.

Chikako Oyanagi

Chikako Oyanagi is a software engineer working on front-end development of IBM Watson Knowledge Studio at Tokyo Software & Systems Development Lab (TSDL), IBM Japan. She has contributed to user interface design and product development from her very first project start-up. She also specializes in robot application programming that runs on communication robots such as Pepper.

Teruki Tauchi

Teruki Tauchi is a front-end software developer for IBM Watson Knowledge Studio at Tokyo Software & Systems Development Lab (TSDL), IBM Japan. He joined IBM after obtaining a Master of Engineering degree in Computer Science from University College London in 2015.

Text Analytics 101