Bootcamp for NLP

Learn and implement end-to-end deep learning models for natural language processing.

09:00 AM to 5:00 PM, 28-29 July 2018, ThoughtFactory, Bangalore

Think of your favorite NLP application that you wish to build: sentiment analysis, named entity recognition, machine translation, information extraction, text summarization, or a recommender system, to name a few. Recent advances in deep learning (DL) have acted as a great catalyst for pushing the boundaries of NLP.

However, feature engineering remains a critical component of any NLP task. For images, directly using pixel intensities is a natural way to represent the data; for text, there is no such natural representation. No matter how good your ML/DL algorithm is, it can do only so much unless there is a rich way to represent the underlying text data. Thus, whatever NLP application you are building, it’s imperative to find a good representation for your text data.

In this bootcamp, we will cover the key concepts, maths, and code behind state-of-the-art NLP techniques. Various representation learning techniques have been proposed in the literature, but there is still a dearth of comprehensive tutorials that cover both the mathematical explanations and the implementation details of these algorithms in satisfactory depth.

This bootcamp aims to bridge this gap. It aims to demystify both the theory (key concepts, maths) and the practice (code) that go into building NLP models. By the end of this bootcamp, participants will have gained a fundamental understanding of these approaches and the ability to implement them on datasets of their interest.

Target Audience

  • Data Science practitioners
  • Corporates and Start-ups working with NLP
  • Anyone (researcher, student, professional) working on NLP


This is a very hands-on course, so participants should be comfortable with programming. Familiarity with the Python data stack is ideal. Prior knowledge of machine learning will be helpful.


The material for the bootcamp is hosted on GitHub. You can find slides for this workshop here.

This bootcamp is part of the speakers' popular bootcamp series on NLP. Additional relevant materials will be shared prior to the bootcamp.


This will be a two-day, instructor-led, hands-on bootcamp to learn and implement end-to-end deep learning models for natural language processing.

  • Day 1 will cover an introduction to text representation and the old ways of representing text, followed by a deep dive into embedding spaces and word vectors.
  • Day 2 will cover more advanced techniques for representing text, such as Paragraph2vec/Doc2vec, and various architectures for char2vec.

There will be four sessions of three hours each over the two days.

Session 1: Introduction to representation learning

  1. What is representation learning?
  2. Use cases in natural language processing
  3. Old ways of representing text
    • One-hot encoding
    • Tf-idf
    • N-grams
  4. How to use pre-trained word embeddings
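
As a rough illustration of the "old ways" listed above, here is a minimal pure-Python sketch of one-hot encoding, n-grams, and TF-IDF. The toy corpus and the helper names (`one_hot`, `ngrams`, `tf_idf`) are assumptions made for this example and are not taken from the bootcamp material. The TF-IDF variant shown (term frequency times `log(N / df)`) is only one of several common weightings.

```python
import math
from collections import Counter

# Hypothetical toy corpus, used only for illustration.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]
docs = [doc.split() for doc in corpus]
vocab = sorted({tok for doc in docs for tok in doc})
index = {tok: i for i, tok in enumerate(vocab)}


def one_hot(token):
    """Vector of zeros with a single 1 at the token's vocabulary index."""
    vec = [0] * len(vocab)
    vec[index[token]] = 1
    return vec


def ngrams(tokens, n):
    """All contiguous runs of n tokens, in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(doc):
    """Map each token in doc to tf * idf, with idf = log(N / df)."""
    counts = Counter(doc)
    n_docs = len(docs)
    scores = {}
    for tok, count in counts.items():
        tf = count / len(doc)                  # relative frequency in this document
        df = sum(1 for d in docs if tok in d)  # number of documents containing tok
        scores[tok] = tf * math.log(n_docs / df)
    return scores


print(one_hot("cat"))
print(ngrams(docs[0], 2))
print(tf_idf(docs[0]))
```

Note that the one-hot vectors grow with the vocabulary and carry no notion of similarity between words, which is one motivation for the learned embeddings covered in the later sessions.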

Session 2: Word-vectors

  1. Introduction to word-vectors
  2. Different techniques of generating word-vectors
    • CBOW, Skip-gram model
    • GloVe model
  3. Detailed implementation of each of these models in TensorFlow
  4. Negative sampling, hierarchical softmax, t-SNE
  5. Fine-tuning pretrained embeddings
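
To make the difference between the CBOW and skip-gram models concrete, the sketch below builds the training examples each model would see from a tokenized sentence. It is a simplified illustration only: the function names and the fixed symmetric window are assumptions for this example, and it covers data preparation rather than the model training itself.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs: skip-gram predicts each context word from the center."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """Return (context_words, center) examples: CBOW predicts the center from its context."""
    examples = []
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        if context:
            examples.append((context, center))
    return examples


# Hypothetical example sentence.
sentence = "the quick brown fox jumps".split()
print(skipgram_pairs(sentence, window=1))
print(cbow_examples(sentence, window=1))
```

With negative sampling, each observed (center, context) pair would additionally be contrasted with a few randomly drawn words, instead of computing a softmax over the whole vocabulary.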

Session 3: Sentence2vec/Paragraph2vec/Doc2vec

  1. Extending word vectors to represent sentences/paragraphs/documents
  2. Various techniques for training doc2vec
    • Doc2vec
      i. DM
      ii. DBOW
    • Skip-thoughts
  3. Detailed implementation of each of these models in TensorFlow

Session 4: Char2vec

  1. Building character embeddings
  2. Tweet2vec - character embeddings from social data
  3. CNN for character vectors
  4. fastText - character n-gram embeddings
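
As a small illustration of the character n-gram idea behind fastText, the sketch below extracts the n-grams of a word after wrapping it in the boundary markers `<` and `>`. The function name `char_ngrams` and its default range are assumptions for this example; this is not the fastText library's own API.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    wrapped = "<" + word + ">"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


print(char_ngrams("where", n_min=3, n_max=3))
```

In a fastText-style model, a word's vector is built from the vectors of its n-grams, so rare or unseen words can still get a representation from the subword units they share with known words.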

Software Requirements

We will be using the Python data stack for this bootcamp, with Keras and TensorFlow for the deep learning component. Additional requirements will be communicated to participants.


Anuj Gupta

Director - Machine Learning, Huawei Technologies

ThoughtFactory, Tower D, 2nd Floor, Diamond District, Bengaluru, Karnataka 560102