2018-03-04: Installing Stanford CoreNLP in a Docker Container
Fig. 1: Example of Text Labeled with the CoreNLP Part-of-Speech, Named-Entity Recognizer and Dependency Annotators.
The Stanford CoreNLP suite provides a wide range of natural language processing tools, such as Part-of-Speech (POS) tagging and Named-Entity Recognition (NER). CoreNLP is written in Java, with support for other languages. I tested a couple of the latest Python wrappers that provide access to CoreNLP, but could not get them working because of various environment-related complications. Fortunately, with the help of Sawood Alam, our very able Docker campus ambassador at Old Dominion University, I was able to create a Dockerfile that installs and runs the CoreNLP server (version 3.8.0) in a container. This eliminated the headaches of installing the server and also provided a simple way to access CoreNLP services through HTTP requests.
How to run the CoreNLP server on localhost port 9000 from a Docker container
- Install Docker if not already available
- Pull the image from the repository and run the container:
The server can be used from the browser, the command line, or custom scripts:
- Browser: To use the CoreNLP server from the browser, open your browser and visit http://localhost:9000/. This presents the user interface (Fig. 1) of the CoreNLP server.
- Command line (NER example):
Fig. 2: Sample request URL sent to the Named Entity Annotator.
- Custom script (NER example): I created a Python function nlpGetEntities() that uses the NER annotator to label a user-supplied text.
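For readers who want to try the custom-script route, here is a minimal sketch of what such a function could look like. It is not the original nlpGetEntities(); the helper names (build_request_url, extract_entities, nlp_get_entities_sketch) are my own, and it assumes the server is reachable on localhost port 9000 and returns JSON with a "sentences" list whose "tokens" carry "word" and "ner" fields.

```python
import json
import urllib.parse
import urllib.request


def build_request_url(host="localhost", port=9000,
                      annotators="tokenize,ssplit,pos,ner"):
    """Build the server URL with the annotator settings in the query string."""
    properties = {"annotators": annotators, "outputFormat": "json"}
    query = urllib.parse.urlencode({"properties": json.dumps(properties)})
    return f"http://{host}:{port}/?{query}"


def extract_entities(response):
    """Collect (word, label) pairs for tokens whose NER label is not 'O'."""
    entities = []
    for sentence in response.get("sentences", []):
        for token in sentence.get("tokens", []):
            label = token.get("ner", "O")
            if label != "O":  # 'O' marks tokens outside any named entity
                entities.append((token.get("word", ""), label))
    return entities


def nlp_get_entities_sketch(text, host="localhost", port=9000):
    """POST the text to the server and return its labeled entities.

    Assumption: the container from this post is running and the server
    answers on the given host and port.
    """
    url = build_request_url(host, port)
    request = urllib.request.Request(
        url, data=text.encode("utf-8"), method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as reply:
        return extract_entities(json.loads(reply.read().decode("utf-8")))
```

With the container running, calling nlp_get_entities_sketch() on a sentence should return a list of (word, label) tuples. Keeping the parsing in extract_entities() separate from the HTTP call makes it possible to check the logic against a saved response without a live server.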
To stop the server, issue the following command:
The Dockerfile I created targets CoreNLP version 3.8.0 (2017-06-09). A newer version of the service (3.9.1) is available. I believe it should be easy to adapt the Dockerfile to install it by replacing all occurrences of "2017-06-09" with "2018-02-27". However, I have not tested this change or run version 3.9.1 against my application benchmark, since the differences between versions 3.8.0 and 3.9.1 are marginal for my use case.
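The date substitution itself can be sketched in a few lines of Python. This only illustrates the string replacement; like the change it describes, it is untested against the real Dockerfile. The function name retarget_release and the sample Dockerfile lines are hypothetical, so the actual file contents may differ.

```python
OLD_RELEASE = "2017-06-09"  # release date of CoreNLP 3.8.0, per this post
NEW_RELEASE = "2018-02-27"  # release date of CoreNLP 3.9.1, per this post


def retarget_release(dockerfile_text, old=OLD_RELEASE, new=NEW_RELEASE):
    """Return the text with every old release date replaced, plus the count."""
    count = dockerfile_text.count(old)
    return dockerfile_text.replace(old, new), count


if __name__ == "__main__":
    # Hypothetical Dockerfile lines, for illustration only.
    sample = (
        "RUN unzip stanford-corenlp-full-2017-06-09.zip\n"
        "WORKDIR stanford-corenlp-full-2017-06-09\n"
    )
    updated, replaced = retarget_release(sample)
    print(updated)
    print(f"{replaced} occurrence(s) replaced")
```

Returning the count makes it easy to confirm that the date appeared where expected before rebuilding the image.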
--Nwala