Wizeline builds NLP platform to highlight US presidential candidates’ stances on trending topics
The US presidential election matters to most US citizens and, to an extent, to the rest of the world. People want to know where the candidates stand on specific topics, but everyday obligations often make it impossible to watch the presidential debates in real time. The debates are transcribed, but the transcripts are not presented in a way that is easy to digest, so people either misunderstand what was said or stay uninformed.
About the Customer
Our customer, a Media Publishing Company (MPC), was founded in New York City in the late 19th century to convey complex financial information to the general public. Drawing on its broad experience with financial data and political analysis, MPC today operates multiple media, news, and information brands worldwide and offers risk management, analytics research, and trading solutions to other enterprises.
MPC partnered with Amazon Web Services (AWS) and Wizeline to create a web platform that interprets user questions about electoral candidates. The idea was to serve up relevant quotes from electoral debates and public appearances in response to each user's specific question.
The platform’s main challenge was to understand questions written in natural language, which simplifies the user’s search. The second challenge was to take the main idea of the user’s question and find it in the transcript database, which holds transcripts going back up to 30 years.
The platform should also offer users the latest relevant information organized by candidate or issue, and let them filter it by date, candidate, or keyword.
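The filtering requirement can be illustrated with a minimal sketch. The `Quote` record, its field names, and the `filter_quotes` function are hypothetical and only show the idea; they are not the platform's actual data model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Quote:
    """Hypothetical record for one candidate quote."""
    candidate: str
    said_on: date
    text: str


def filter_quotes(quotes, candidate=None, since=None, until=None, keyword=None):
    """Return the quotes that match every given filter, newest first."""
    matches = []
    for quote in quotes:
        if candidate is not None and quote.candidate.lower() != candidate.lower():
            continue
        if since is not None and quote.said_on < since:
            continue
        if until is not None and quote.said_on > until:
            continue
        if keyword is not None and keyword.lower() not in quote.text.lower():
            continue
        matches.append(quote)
    # Latest information first, as the requirement asks.
    return sorted(matches, key=lambda quote: quote.said_on, reverse=True)
```

Each filter is optional, so the same function covers browsing by candidate, narrowing by date range, and keyword search.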
Lastly, implementing Machine Learning and Natural Language Processing (NLP) can be expensive and time-consuming. The project needed a way to work around these constraints and still deliver excellent results.
Wizeline and AWS engineers created a web platform using both AWS services and MPC’s proprietary tools. The platform uses AWS services in three sections: frontend, backend, and data ingestion.
The platform is open to the public, so it does not implement authentication or login services. The frontend reaches the platform’s data through backend endpoints exposed via Amazon Route 53 DNS. The frontend is divided into three repositories:
- Repository one contains the interface’s React elements
- Repository two contains and manages the API calls needed to respond to user actions
- Repository three ties repositories one and two to a server-side layout rendering service managed by MPC
The backend endpoints that serve the frontend were built with Amazon API Gateway and exposed through Amazon Route 53, which routes incoming requests to API Gateway. API Gateway then resolves each request against DynamoDB; some requests also pass through AWS Lambda, depending on their functionality. Because of the sensitive content the platform handles, AWS WAF (Web Application Firewall) protects the API endpoints from possible attacks. The platform stores its data in Amazon DynamoDB and Amazon S3 buckets.
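As an illustration of the API Gateway, Lambda, and DynamoDB path, here is a minimal sketch of a Lambda handler behind an API Gateway proxy integration. The table name `quotes`, the `candidate` key, and the `quote_text` attribute are assumptions made for the example, not the platform's real schema. The DynamoDB client is passed in so the sketch can run without AWS access.

```python
import json


def _response(status, payload):
    """Build a response in the shape API Gateway's Lambda proxy integration expects."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def make_handler(dynamodb_client, table_name="quotes"):
    """Return a Lambda handler bound to the given DynamoDB client.

    `table_name` and the attribute names below are hypothetical.
    """

    def handler(event, context):
        # API Gateway sends null when the request has no query string.
        params = event.get("queryStringParameters") or {}
        candidate = params.get("candidate")
        if not candidate:
            return _response(400, {"error": "candidate is required"})

        result = dynamodb_client.query(
            TableName=table_name,
            KeyConditionExpression="candidate = :c",
            ExpressionAttributeValues={":c": {"S": candidate}},
        )
        quotes = [
            {"candidate": item["candidate"]["S"], "text": item["quote_text"]["S"]}
            for item in result.get("Items", [])
        ]
        return _response(200, {"quotes": quotes})

    return handler
```

In a deployment, the handler would typically be created once at module load with `make_handler(boto3.client("dynamodb"))`; injecting the client keeps the request logic testable with a stand-in object.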
The biggest challenge for the platform is data ingestion. The platform ingests presidential campaign transcripts going back up to 30 years. These transcripts are delivered by an MPC proprietary application stream. The streams are pre-processed on an Amazon Elastic Compute Cloud (EC2) instance with a stream listener that verifies, cleans, and transforms the stream’s data structure. Then, the stream goes through two different processing paths:
- The stream is queued through Amazon Simple Notification Service (SNS) and Amazon Simple Queue Service (SQS) to be stored as a raw event using Amazon Kinesis Data Firehose, AWS Lambda, and Amazon S3 buckets
- The same stream is processed by a Key Phrase Extraction Lambda, then topic-tagged by Amazon SageMaker. Once processed and tagged, the stream is stored in DynamoDB and indexed by Amazon Kendra
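The pre-processing step and the split into two paths can be sketched as follows. The event schema (`speaker`, `spoken_at`, `text`) and the function names are assumptions for illustration, since the format of MPC's proprietary stream is not described here; the two paths are represented by plain callables rather than real AWS clients.

```python
import json
from datetime import datetime

# Hypothetical schema: the real stream format is proprietary.
REQUIRED_FIELDS = ("speaker", "spoken_at", "text")


def preprocess(raw_event):
    """Verify, clean, and transform one raw transcript event.

    Returns a normalized dict, or None when the event fails verification.
    """
    try:
        data = json.loads(raw_event)
    except (TypeError, ValueError):
        return None
    if not isinstance(data, dict):
        return None
    # Verify: every required field must be a non-empty string.
    for field in REQUIRED_FIELDS:
        value = data.get(field)
        if not isinstance(value, str) or not value.strip():
            return None
    try:
        spoken_at = datetime.fromisoformat(data["spoken_at"].strip())
    except ValueError:
        return None
    # Clean and transform: trim names, collapse whitespace, normalize the date.
    return {
        "candidate": data["speaker"].strip(),
        "date": spoken_at.date().isoformat(),
        "text": " ".join(data["text"].split()),
    }


def fan_out(record, sinks):
    """Send the same record down every processing path."""
    for sink in sinks:
        sink(record)
```

In this sketch, one sink would stand in for the raw-event archive path and the other for the key-phrase and topic-tagging path; both receive the identical pre-processed record.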
Results and Benefits
Wizeline built a strong development team made up mostly of full-stack engineers familiar with MPC’s environment. Within four weeks, the team had learned the repositories’ workflow and structure and the local environment. The final product was a web platform that answered user questions about the US 2020 elections on particular topics. The platform kept US voters and the general public informed about what each electoral candidate had said on important issues.
Thanks to the integration of Amazon Kendra and Amazon SageMaker, users can quickly type in a question on any topic, and the platform delivers relevant quotes on the subject. With these AWS tools, work that could have taken years of in-house development took only a few months.
Wizeline engineers applied QA and DevOps best practices on this project, practices that some of MPC’s own projects lacked. MPC took an interest in them and decided to adopt them in its new projects to create a secure code cycle.
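The question-to-quote lookup can be sketched as a thin wrapper around the Amazon Kendra `query` operation. The function name `find_quotes` and the shape of the returned dictionaries are assumptions for the example, and the platform's actual query logic is not described in this case study. The client is passed in so the sketch can be exercised without an AWS account.

```python
def find_quotes(kendra_client, index_id, question, limit=3):
    """Ask a Kendra index a natural-language question and return short excerpts.

    `kendra_client` is expected to expose the Kendra `query` operation,
    which takes `IndexId` and `QueryText` and returns `ResultItems`.
    """
    response = kendra_client.query(IndexId=index_id, QueryText=question)
    quotes = []
    for item in response.get("ResultItems", []):
        excerpt = item.get("DocumentExcerpt", {}).get("Text", "").strip()
        if not excerpt:
            continue  # skip results that carry no usable text
        quotes.append({
            "document_id": item.get("DocumentId"),
            "title": item.get("DocumentTitle", {}).get("Text", ""),
            "excerpt": excerpt,
        })
    return quotes[:limit]
```

With real credentials, the client would come from `boto3.client("kendra")` and `index_id` would be the identifier of the index that holds the tagged transcripts.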
Finally, the successful implementation of this new platform has also inspired MPC to use a new business model based on DevOps best practices such as:
- Continuous integration: Helps keep faulty code out of the code base, improving software quality
- Continuous delivery: Extends continuous integration by deploying all code changes to a testing environment, ensuring that new changes do not break the software’s current state
- Independence of microservices: Structures the software as a set of small services, each serving a single purpose, to enhance reliability
- Infrastructure as Code (IaC): Provisions and manages infrastructure through code, using software development techniques such as version control and continuous integration, and speeds up infrastructure rollout through standardized patterns
- Observability: Monitors and records metrics that help the business track the problems it encounters, so it can find solutions faster and spot trends that improve the software
Wizeline was founded in 2014 on the notion that access to data unlocks better decision making. Today, over 1,000 Wizeliners bring a data-driven, design-centric, and cloud-native approach to building exceptional products for our customers.
We are the secret weapon of global enterprises and a trusted ally of high-growth startups. Our teams have mastered remote collaboration and have built strong community ties around our office locations in the U.S., Mexico, Vietnam, Thailand, Australia, and Spain. To learn more about Wizeline and our technology partners, visit wizeline.com/partners.