In the final year of my Master’s degree, I developed a sentiment analysis engine as part of my academic professional practice. The engine forms the backend of a sentiment analysis web application which was initially developed by students at Deakin University for an external (industry) client.
The engine (and associated IP) has since been acquired by a startup and will become the core of a commercial service. The code and specific engineering are now proprietary IP and cannot be discussed here. Instead, I can provide a basic overview of the problem space and solution.
Data-driven decision-making has traditionally relied on tightly controlled data gathering and highly structured data (think of a clinical trial). At the same time, communication technologies have enabled the wholesale collection of vast amounts of largely unstructured feedback (think of commentary), both solicited and unsolicited.
The solution is a web-based sentiment analysis service/platform that first targets the evaluation industry: those already in the business of providing or implementing sentiment analysis. At the core of this solution is a 'sentiment analysis engine' that takes unstructured feedback data and proprietary domain dictionaries and produces domain-based and overall sentiment analysis scores.
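Since the real engine is proprietary, the following is only a minimal sketch of the general idea of dictionary-based, per-domain scoring. Everything in it is an assumption made for illustration: the dictionary contents, the weights, the `score_comment` function, and the choice to sum domain scores into an overall score. None of it reflects the actual engine.

```python
import re

# Hypothetical domain dictionaries: each maps a term to a polarity weight.
# The real dictionaries are proprietary; these entries are invented.
DOMAIN_DICTIONARIES = {
    "STAFF": {"excellent": 1.0, "thoughtful": 0.5, "unhelpful": -1.0},
    "ASSESSMENT": {"unclear": -0.5, "fair": 0.5},
}


def score_comment(comment, dictionaries):
    """Return a per-domain score plus an illustrative overall score."""
    # Lowercase the comment and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", comment.lower())

    scores = {}
    for domain, lexicon in dictionaries.items():
        # Sum the weights of every token found in this domain's lexicon.
        scores[domain] = sum(lexicon.get(token, 0.0) for token in tokens)

    # Assumption: overall sentiment is just the sum of the domain scores.
    scores["OVERALL"] = sum(scores.values())
    return scores


print(score_comment("The teaching staff are excellent.", DOMAIN_DICTIONARIES))
```

A production engine would need far more than this (negation handling, phrase matching, normalisation), but the sketch shows how one comment can yield several domain scores alongside a single overall score.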
While my role was to lead the back-end development team, I also had the opportunity to undertake some business analysis - specifically looking at rapid development and implementation of a flexible and scalable solution using a small team of developers, delivering a prototype solution on AWS (Amazon Web Services).
Engine output is available to clients for integration into their own systems (via manual download or API). The service also provides extensive online analytics, allowing evaluation teams to draw insight immediately without further local computation. It also serves as a filtration mechanism, since human review is still necessary to obtain a precise, nuanced understanding of written work.
The service provides extensive custom pre-processing and post-processing. At its most basic, users will be able to filter for comments of particular interest and insight, ignoring data of insignificant value. Below is a very small representative output from university unit feedback with accompanying sentiment scores.
| COMMENT | OUTCOMES | STAFF | UNIT DESIGN | ASSESSMENT | SUPPORT | OVERALL SENTIMENT |
|---|---|---|---|---|---|---|
| The entire unit needs to be re-visited. The explanations of every assignment so unclear that the forums were inundated with unsure students, having a team presentation on the last teaching week is extremely bad timing, giving that most of us are already involved in revision for other units and other work commitments it was very difficult to organise an 'online team' , it was another non-enjoyable part of the subject. | 0.00 | 0.00 | -0.01 | 0.12 | -0.25 | -0.42 |
| The teaching staff are excellent. The classes, seminars and prac labs were also very informative. The prac labs were particularly helpful for hands on experience. Reading resources were also very good, including Prescribed Text. | 0.25 | 2.24 | 0.74 | 0.00 | 0.00 | 0.89 |
| Very well run unit. Very engaging with a thoughtful lecturer who was aware of the learning needs in an online environment. Online setup promoted discussion. Lecturer knew what she was talking about. | 0.00 | 1.23 | 0.74 | 0.00 | 0.00 | 0.82 |
| Make a tutorial time clear in the scheduling and stick with it and be there for a good period of time. I need cloud study as I am juggling around work and children. When you do conduct tutorial, make sure the audio works. It only worked one night I was on. | 0.00 | 0.00 | -0.26 | 0.00 | 0.00 | -0.17 |
| The order of topics was logical and allowed students to build on their knowledge throughout. Although many of the topics were delivered by different lecturers, links to earlier topics were consistently identified and helped to integrate the underlying biochemistry and physiology. I particularly liked the use of papers v text books and thought this reinforced the learning goals of the two written assessments. | 0.00 | 0.37 | 0.00 | 0.00 | 1.12 | 0.73 |
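
As a rough illustration of the basic filtering described above, the sketch below keeps only comments whose overall sentiment is strong in either direction. The `ScoredComment` type, the `filter_by_magnitude` function, and the 0.5 threshold are hypothetical; the overall scores are taken from the table, with each comment shortened to its first sentence.

```python
from dataclasses import dataclass


@dataclass
class ScoredComment:
    text: str       # first sentence of the comment, for brevity
    overall: float  # OVERALL SENTIMENT value from the table


def filter_by_magnitude(rows, threshold):
    """Keep rows whose overall sentiment is at least `threshold` in magnitude."""
    return [row for row in rows if abs(row.overall) >= threshold]


ROWS = [
    ScoredComment("The entire unit needs to be re-visited.", -0.42),
    ScoredComment("The teaching staff are excellent.", 0.89),
    ScoredComment("Very well run unit.", 0.82),
    ScoredComment("Make a tutorial time clear in the scheduling and stick with it and be there for a good period of time.", -0.17),
    ScoredComment("The order of topics was logical and allowed students to build on their knowledge throughout.", 0.73),
]

# Assumed threshold of 0.5: weaker comments are set aside for later review.
for row in filter_by_magnitude(ROWS, 0.5):
    print(f"{row.overall:+.2f}  {row.text}")
```

With this threshold the three strongly positive comments are kept, while the two mildly negative ones fall below the cut-off. A reviewer interested in complaints could instead filter on negative scores only.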