My attempt at the NLP workshop

Tolstoyevsky 1500ed2ed4 Small touches to DVC notebook 2 months ago
.dvc 1a2517aa68 Added a dvc stage to download data from gdrive 2 months ago
data 8873b385c6 Updated 0.zip.dvc 2 months ago
model dfdc2fb905 model_params.json is now a dvc output 2 months ago
notebooks 1500ed2ed4 Small touches to DVC notebook 2 months ago
python 4a7abafeac Straighten out imports for gorenml 2 months ago
templates b26b03ff09 Adding flask server 6 months ago
.gitignore 19bb546f82 Adding PyCharm files to gitignore 3 months ago
README.md 0e68946a89 mac support 2 months ago
requirements.txt 8d050f4196 Adding matplotlib to requirements 2 months ago
server.py ab36accf3f Import fixes in server.py and style_predict.py 2 months ago

README.md

Named-Entity-Recognition Workshop

In this workshop, we will learn how to automatically style a word (bold, italics, etc.) according to its context.

Styles are learned automatically from HTML files and then applied to raw text.

The project mainly serves to demonstrate a deep-learning implementation of named-entity-recognition (NER) models.
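To make the NER framing concrete, the sketch below turns styled HTML into (word, label) pairs and renders such pairs back to HTML. It is a minimal illustration using only the standard library, not the workshop's own style_extract.py; the label names (BOLD, ITALIC, O) and the helper names are assumptions made for this example.

```python
from html.parser import HTMLParser

# Hypothetical tag-to-label mapping; the workshop's real label set may differ.
STYLE_TAGS = {"b": "BOLD", "strong": "BOLD", "i": "ITALIC", "em": "ITALIC"}
LABEL_TAGS = {"BOLD": "b", "ITALIC": "i"}


class StyleLabeler(HTMLParser):
    """Collects (word, label) pairs; unstyled words get the label "O"."""

    def __init__(self):
        super().__init__()
        self.open_styles = []  # stack of currently open style labels
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag in STYLE_TAGS:
            self.open_styles.append(STYLE_TAGS[tag])

    def handle_endtag(self, tag):
        # Assumes well-formed HTML: a closing style tag ends the innermost style.
        if tag in STYLE_TAGS and self.open_styles:
            self.open_styles.pop()

    def handle_data(self, data):
        # The innermost open style wins; text outside any style tag is "O".
        label = self.open_styles[-1] if self.open_styles else "O"
        for word in data.split():
            self.pairs.append((word, label))


def html_to_pairs(html):
    """Extract training pairs from an HTML string."""
    parser = StyleLabeler()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.pairs


def pairs_to_html(pairs):
    """Apply labels (for example, model predictions) to words as HTML tags."""
    rendered = []
    for word, label in pairs:
        tag = LABEL_TAGS.get(label)
        rendered.append(f"<{tag}>{word}</{tag}>" if tag else word)
    return " ".join(rendered)
```

A trained model would predict the label column for raw text, and pairs_to_html would then produce the styled output.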

Preparing the environment (locally)

Note: these steps apply only if you are not using Colab.

  1. Make sure Python3 is installed (https://www.python.org/downloads/release/python-364/).
  2. Create a virtual environment (recommended): python3 -m virtualenv ner_ws
  3. Activate the virtual environment: source ner_ws/bin/activate
  4. Install all of the requirements: pip3 install -r requirements.txt
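After the steps above, a quick sanity check can confirm that the interpreter is Python 3 and that a virtual environment is active. This is a small sketch rather than part of the repository; the function names are made up for illustration.

```python
import sys


def in_virtual_env():
    """Return True if the running interpreter belongs to a virtual environment."""
    # venv and virtualenv >= 20 expose the base interpreter as sys.base_prefix;
    # older virtualenv releases set sys.real_prefix instead.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


def check_environment(min_version=(3, 0)):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_version:
        wanted = ".".join(str(part) for part in min_version)
        problems.append(f"Python {wanted} or newer is required")
    if not in_virtual_env():
        problems.append("no virtual environment is active (did you run the activate step?)")
    return problems
```

Running check_environment() inside the activated ner_ws environment should return an empty list.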

Usage

  1. Run style_extract.py to generate training files from HTML.
  2. Put the .zip file in the data/ folder.
  3. Run style_learn.py to train an NER model.
  4. Run server.py to evaluate your model in the browser.
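The usage steps can also be chained from a single script. The sketch below assumes the three scripts are reachable from the current directory and take no command-line arguments, neither of which this README confirms; the helper names are hypothetical, so adjust paths and arguments to your checkout.

```python
import subprocess
import sys
from pathlib import Path


def has_training_zip(data_dir="data"):
    """Return True if data_dir contains at least one .zip file (step 2)."""
    return any(Path(data_dir).glob("*.zip"))


def build_command(script):
    """Run a script with the same interpreter that runs this file."""
    return [sys.executable, script]


def run_pipeline(data_dir="data"):
    # Step 1: generate training files from HTML.
    subprocess.run(build_command("style_extract.py"), check=True)
    # Step 2: the .zip file must be in place before training starts.
    if not has_training_zip(data_dir):
        raise FileNotFoundError(f"no .zip file found in {data_dir}/")
    # Step 3: train the NER model.
    subprocess.run(build_command("style_learn.py"), check=True)
    # Step 4: start the server to evaluate the model in the browser.
    subprocess.run(build_command("server.py"), check=True)
```

With check=True, a failing step raises an exception and stops the sequence, so training never starts on incomplete data.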

For more details, contact me at goren.ml.