Improving user experience with a cloud data service and recommendation engine
DesignSpark is part of RS Group PLC, a global omni-channel provider of product and service solutions for designers, builders and maintainers of industrial equipment and operations. The DesignSpark brand is a community platform that engages with engineers and innovators by providing design tools, technical resources and technical content to support their engineering design projects.
Datasparq was engaged to help DesignSpark enhance the user experience on its main site. As with many content-rich sites, helping users both find relevant information and discover new content increases engagement and allows them to contribute to the community more effectively.
Furthermore, by consuming content, users get the chance to learn, interact and subsequently buy components from the Electrocomponents PLC main site. Getting this right would allow better engagement and thus better monetisation of the community.
Datasparq suggested building a personalised recommendation engine that would allow users to discover relevant content based on their interests and the interests of their peers, thus improving their online experience and the way they interact with the site.
To achieve this, Datasparq designed a rich article metadata and classification engine, which was needed to support enhanced recommendations, analytics and content commissioning. To make this a reality, we built a data service that continuously analyses the content of every article on the website, understands it semantically and creates ontological models around it. We then used various algorithms to provide article recommendations to users based on their reading history.
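As a rough illustration of recommending from reading history, the sketch below assumes that the batch analysis has already reduced each article to a vector of topic weights. The article IDs, topic names, weights and the cosine-similarity scoring are all illustrative assumptions and do not describe the algorithms used in production.

```python
import math
from collections import defaultdict

# Hypothetical topic weights per article, standing in for the output of the
# batch semantic analysis (IDs, topics and weights are made up).
ARTICLE_TOPICS = {
    "a1": {"pcb-design": 0.9, "power": 0.1},
    "a2": {"pcb-design": 0.7, "3d-printing": 0.3},
    "a3": {"robotics": 0.8, "power": 0.2},
    "a4": {"3d-printing": 1.0},
}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def user_profile(history):
    """Sum the topic vectors of the articles a user has read."""
    profile = defaultdict(float)
    for article_id in history:
        for topic, weight in ARTICLE_TOPICS.get(article_id, {}).items():
            profile[topic] += weight
    return dict(profile)


def recommend(history, k=2):
    """Return up to k unread article IDs, most similar to the profile first."""
    profile = user_profile(history)
    read = set(history)
    scored = [
        (cosine(profile, topics), article_id)
        for article_id, topics in ARTICLE_TOPICS.items()
        if article_id not in read
    ]
    # Highest score first; ties broken by article ID for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [article_id for score, article_id in scored if score > 0][:k]
```

Calling `recommend(["a1"])` ranks the unread articles by how closely their topics match what the user has already read, and drops any article with no topical overlap at all.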
The platform has been in production since January 2022, helping DesignSpark engage its community of engineers through daily recommendations.
The content extraction and semantic analysis layer runs as a batch process, while recommendations are generated in real time for each online user and served through an API that adheres to the agreed SLA of sub-second latency.
The service and the data pipeline were designed and delivered on Google Cloud Platform (GCP). Following Datasparq’s engineering principles and best practices, and leveraging our “elements of engineering”, we deployed the new service on GCP using serverless components and an infrastructure-as-code (IaC) approach through Terraform.
The API is served through Google Cloud Run, which allowed us to run stateless containers with a bit more flexibility than the then-current version of Cloud Functions allowed.
Each day we ingest user activity data from Google Analytics (GA4) into BigQuery, where we apply our topic extraction methods and run our classification models. We then save the output in a Cloud SQL for PostgreSQL instance, ready to be served to users through our API.
The solution uses the following GCP components:
- Cloud Functions
- Cloud Storage
- Firestore for our Houston orchestration tool
- Memorystore for our Houston orchestration tool
- Cloud Run
- Cloud SQL
- Secret Manager
- Cloud Build
- Identity and Access Management
Timely recommendations reach the website's users through a sub-second latency API built with Python’s FastAPI, with gunicorn providing concurrency.
With the current architecture, we can support multiple algorithms and versions of trained models, allowing us to perform robust A/B testing in a live environment.
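One common way to split live traffic between algorithms or model versions is deterministic, hash-based bucketing, so that a given user always sees the same variant. The sketch below is a minimal example of that idea. The experiment name, variant names and traffic shares are hypothetical, and the source does not state how assignment is actually done.

```python
import hashlib

# Hypothetical experiment: variant names and traffic shares are illustrative.
VARIANTS = [("content_based_v1", 0.5), ("collaborative_v2", 0.5)]


def assign_variant(user_id, experiment="recs-ab", variants=VARIANTS):
    """Map a user to a variant deterministically so repeat visits match."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Turn the first 8 hex digits into a number in [0, 1).
    bucket = int(digest[:8], 16) / 16**8
    cumulative = 0.0
    for name, share in variants:
        cumulative += share
        if bucket < cumulative:
            return name
    # Guard against floating-point rounding when shares sum to just under 1.
    return variants[-1][0]
```

Including the experiment name in the hash means a new experiment reshuffles users independently of earlier ones, and no assignment table needs to be stored or looked up at request time.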
DesignSpark now has an AI platform that can similarly support other AI use cases that rely on API serving and batch model training.