Building an NLP Platform to Keep US Voters Informed of Presidential Candidates’ Stances on Trending Topics

Background

The US Presidential Election is an important event to most citizens and, to an extent, the world in general. People want to know the ideas and opinions of the electoral candidates on specific topics, but it’s difficult to keep up with their stances both historically and in real-time and find this information all in one place. Thankfully, presidential debates are transcribed and candidates’ opinions on various topics are well documented, but unfortunately, this information is not presented in a way that’s easy to digest, leading people to misunderstand the information or not get informed at all.

Our customer, a Media Publishing Company (MPC), partnered with Amazon Web Services (AWS) and Wizeline to create a web platform to interpret user questions about electoral candidates. The idea was to build a natural language processing (NLP) platform that could serve up relevant quotes mentioned in electoral debates and public appearances based on user-specific questions.

Read more to see how Wizeline, AWS, and MPC kept US voters and the general public informed about each electoral candidate’s stance on essential issues.

About the Customer

Our customer, a Media Publishing Company (MPC), was founded in the late 19th century in New York City as a company specializing in conveying complex financial information to the general public. Thanks to its broad experience with financial data and political analysis, MPC today operates multiple media, news, and information brands worldwide and offers solutions for risk management, analytics research, and trading to other enterprises.

The Challenge

The platform’s main challenge was to understand questions written in natural language to simplify the user’s search task. The second challenge was taking the main idea from the user’s query and finding corresponding information in the transcript database, including transcripts from up to 30 years back in time. 

The platform would also need to offer users the latest relevant information sorted by candidate or issue, enabling users to filter them by date, candidate, or keyword. 

Lastly, since implementing machine learning and NLP can be expensive and time-consuming, the project needed a way to work around these constraints and still deliver excellent results.

Our Solution

Wizeline and AWS engineers created a web platform using both AWS applications and MPC’s proprietary tools. The platform uses AWS services in three sections: frontend, backend, and data ingestion.

Frontend

The platform is open to the public, which means it doesn’t implement authentication or login services. To access the platform’s data, it exposes the backend endpoints via Amazon Route 53 DNS. The platform’s frontend is divided into three repositories:

  • Repository one containing the interface’s React elements
  • Repository two managing the API calls needed to interact with users’ actions
  • Repository three tying repositories one and two to a server-side layout rendering service managed by MPC

Backend

Backend endpoints for the frontend were designed with Amazon API Gateway and exposed through Amazon Route 53, which directs requests to the API Gateway. The API Gateway then resolves each request against DynamoDB, and some requests also go to AWS Lambda, depending on their functionality. Because of the sensitive content used in the platform, AWS WAF (Web Application Firewall) protects the API endpoints from possible attacks. To store the data, the platform uses Amazon DynamoDB and Amazon S3 buckets.
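As a rough illustration of the kind of request a Lambda-backed endpoint might handle, the sketch below filters quotes by candidate, keyword, and date and returns the latest first. It is an assumption-laden example, not the platform's code: the `QUOTES` list stands in for the DynamoDB table, and the field names and query parameters (`candidate`, `keyword`, `since`) are hypothetical. Only the event and response shapes follow the standard API Gateway Lambda proxy integration.

```python
import json

# Stand-in for the DynamoDB table; records and field names are hypothetical.
QUOTES = [
    {"candidate": "Candidate A", "date": "2020-09-29",
     "text": "We will expand health care coverage."},
    {"candidate": "Candidate B", "date": "2020-10-22",
     "text": "Our plan lowers taxes for families."},
]


def handler(event, context):
    """Handle an API Gateway proxy event and return matching quotes."""
    # API Gateway sends None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    candidate = params.get("candidate")
    keyword = (params.get("keyword") or "").lower()
    since = params.get("since")  # ISO date string, so text comparison works

    matches = [
        q for q in QUOTES
        if (not candidate or q["candidate"] == candidate)
        and (not keyword or keyword in q["text"].lower())
        and (not since or q["date"] >= since)
    ]
    # Latest quotes first.
    matches.sort(key=lambda q: q["date"], reverse=True)

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(matches),
    }
```

In a real deployment the list comprehension would presumably be replaced by a DynamoDB query, with the filter fields chosen to match the table's keys and indexes.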

Data Ingestion

The biggest challenge for the platform was data ingestion, since the platform ingests transcripts from presidential campaigns from up to 30 years back in time. An MPC proprietary application stream delivers these transcripts. The streams are pre-processed in an Amazon Elastic Compute Cloud (EC2) instance with a stream listener that verifies, cleans, and transforms the stream’s data structure. Then, the stream goes through two different processing paths:

  • The stream is queued through Amazon Simple Notification Service (SNS) and Amazon Simple Queue Service (SQS) to be stored as a raw event using Amazon Kinesis Data Firehose, AWS Lambda, and Amazon S3 buckets
  • The same stream is processed by a Key Phrase Extraction Lambda, then topic-tagged by Amazon SageMaker. Once processed and tagged, the stream is stored in DynamoDB and indexed by Amazon Kendra
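The pre-processing step and the two processing paths described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `TranscriptEvent` fields, the `preprocess` and `fan_out` helpers, and the message layout are all hypothetical, since the MPC stream schema is not described here, and plain Python callables stand in for the SNS/SQS and Lambda destinations.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class TranscriptEvent:
    """Hypothetical normalized record; the real stream schema is not public."""
    candidate: str
    date: str  # ISO date, e.g. "2020-09-29"
    text: str


def preprocess(raw: dict) -> Optional[TranscriptEvent]:
    """Verify, clean, and transform one raw stream message."""
    # Verify: drop messages that are missing a required field.
    if not all(raw.get(key) for key in ("candidate", "date", "text")):
        return None
    # Clean: collapse runs of whitespace left over from the transcript feed.
    text = re.sub(r"\s+", " ", raw["text"]).strip()
    if not text:
        return None
    # Transform: emit the normalized structure used downstream.
    return TranscriptEvent(raw["candidate"].strip(), raw["date"], text)


def fan_out(event: TranscriptEvent,
            sinks: Iterable[Callable[[TranscriptEvent], None]]) -> None:
    """Send the same event down every processing path."""
    for sink in sinks:
        sink(event)
```

A listener would call `preprocess` on each incoming message and, when it returns an event, pass it to `fan_out` with one sink for the raw-archive path and one for the enrichment path.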

Results and Benefits

The NLP platform development project was successful thanks to Wizeline’s strong development team of mostly full-stack engineers familiar with MPC’s environment. In four weeks, the team understood the repositories’ workflow and structure and the local environment. The final product is a web platform that answered user questions about particular topics in the 2020 US elections. The platform launched in September 2020 and had 40k+ users in the first two weeks.

Thanks to the integration of Amazon Kendra and Amazon SageMaker, users can quickly type in their questions regarding any topic, and the platform delivers relevant quotes on the subject matter. What could have taken years of private development took only a few months by implementing these AWS tools and leveraging Wizeline’s AWS expertise.
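To make the question-to-quote lookup concrete, here is a hedged sketch of how a natural-language question might be passed to a Kendra index. The `relevant_quotes` helper and its parameters are hypothetical and not taken from the project. The client is passed in so the function can be tried without AWS credentials; the `query` call and the `ResultItems` / `DocumentExcerpt` response fields follow the Amazon Kendra Query API.

```python
from typing import Any, List


def relevant_quotes(kendra_client: Any, index_id: str,
                    question: str, limit: int = 3) -> List[str]:
    """Return up to `limit` excerpts that Kendra ranks as relevant."""
    response = kendra_client.query(IndexId=index_id, QueryText=question)
    excerpts = []
    for item in response.get("ResultItems", []):
        # Each result carries a short excerpt of the matching document.
        text = item.get("DocumentExcerpt", {}).get("Text", "").strip()
        if text:
            excerpts.append(text)
    return excerpts[:limit]
```

With boto3 installed and credentials configured, the client would be created with `boto3.client("kendra")` and the index ID taken from the deployed index.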

Wizeline engineers implemented QA and DevOps best practices on this project, which some of MPC’s other projects lacked. The customer found these practices valuable and decided to implement them in their new projects to create a secure code cycle. 

Finally, the successful implementation of this new platform has also inspired MPC to use a new business model based on DevOps best practices such as:

Continuous integration: Helps prevent faulty lines in the code base, improving the software quality

Continuous delivery: Extends continuous integration by deploying all code changes to a testing environment, ensuring that new changes won’t break the software’s current state

Independence of microservices: Structures the software as a set of small services, each serving a single purpose, to enhance the reliability of the software

Infrastructure as Code (IaC): Provisions and manages infrastructure as code through software development techniques such as version control and continuous integration, speeding up infrastructure rollout through standardized patterns

Observability: Monitors and records metrics that help the business track the problems it encounters, so it can find solutions faster and spot trends to improve the software

Wizeline continues to work with MPC on a variety of other projects spanning platform development, data analytics, UX design, technical writing, and other areas.

