4 answers
How do you make a base recommendation system for an app without a large amount of data, assuming the data would be provided by the users?
#recommender-system
#computer-science
#machine-learning
Updated
Martijn’s Answer
Hi, can you share the business context for the app (an app to order flowers, recommend books, medical, news ...)? That might lead to more directed answers to your question.
Without any context, here are some approaches to consider:
- recommend your top "sellers"
- ask domain experts to come up with logical pairings of items
- randomize the recommendations
- is there an alternative system/model to borrow ideas from - from offline to online, an app in an adjacent field, ...?
- use information about content similarity (as described in https://www.bibblio.org/blog/three-ways-build-effective-recommender-system-without-audience-data)
I would probably run several of these approaches side by side in an A/B/x experiment, then refine them based on the actual choices the app users start making.
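As a rough illustration of two of the approaches above (content similarity, with top "sellers" as a fallback), here is a minimal Python sketch. The item ids, tags, and function names are all made up for the example, and Jaccard overlap on tags is just one simple way to measure content similarity; it is not taken from the linked article.

```python
from collections import Counter

# Hypothetical catalogue: item id -> set of descriptive tags (made-up data).
ITEM_TAGS = {
    "q1": {"python", "career", "data-science"},
    "q2": {"python", "machine-learning"},
    "q3": {"nursing", "career"},
    "q4": {"data-science", "python", "statistics"},
}


def jaccard(a, b):
    """Share of tags two items have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_items(item_id, k=2):
    """Rank the other items by tag overlap with item_id; needs no user data."""
    base = ITEM_TAGS[item_id]
    scored = [
        (jaccard(base, tags), other)
        for other, tags in ITEM_TAGS.items()
        if other != item_id
    ]
    # Highest score first; ties broken by item id so results are stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for score, other in scored if score > 0][:k]


def top_items(interactions, k=2):
    """The 'top sellers' fallback: most frequently chosen items so far."""
    counts = Counter(interactions)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


def recommend(last_viewed, interactions, k=2):
    """Use content similarity when possible, otherwise fall back to popularity."""
    if last_viewed in ITEM_TAGS:
        recs = similar_items(last_viewed, k)
        if recs:
            return recs
    return top_items(interactions, k)
```

For example, `recommend("q1", [])` works with no interaction data at all, while `recommend(None, ["q3", "q3", "q1"])` falls back to whatever has been chosen most often so far.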
Google "recommender system without user data" or similar for other insights
Thanks for the advice. Let's say it's an app similar to CareerVillage, in the sense that it should recommend other users, professionals, topics, and questions to its users. What sort of process might have been used when the website was first created and data was likely limited?
Rashad
Updated
Liam’s Answer
Depending on the platform, there are data generation tools that you can use. One quick and dirty method, if you have Excel or Google Sheets, is to enter a few rows of data by hand, then write a formula that alters the values a little for each new row, which you can fill down automatically by dragging or double-clicking. Check out the randomizing functions in Excel for ideas on how to quickly create a large dataset.
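If you would rather script it than use a spreadsheet, the same idea can be sketched in Python with the standard `random` module. The column names (`user_id`, `item_id`, `rating`) and the default sizes are assumptions for illustration only; swap in whatever fields your app actually records.

```python
import random


def make_synthetic_interactions(n_rows, n_users=50, n_items=20, seed=0):
    """Generate fake (user, item, rating) rows for prototyping.

    All field names and ranges are hypothetical placeholders.
    """
    rng = random.Random(seed)  # fixed seed so the dataset is reproducible
    rows = []
    for _ in range(n_rows):
        rows.append({
            "user_id": f"user_{rng.randrange(n_users)}",
            "item_id": f"item_{rng.randrange(n_items)}",
            "rating": rng.randint(1, 5),  # inclusive on both ends
        })
    return rows
```

Keep in mind that uniformly random rows have no real patterns in them, so they are good for testing that your pipeline runs, not for judging recommendation quality.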
Another idea is to look for an existing dataset with a dataset search engine, such as Google Dataset Search:
https://datasetsearch.research.google.com/
Find something similar to what you need, modify it as necessary, then presto! You have yourself some cool data to play with.
Good Luck!
Thank you I will look into this!
Rashad
Updated
David’s Answer
Any design would be of limited use without a clear direction on how much data will be used and how. Try to evaluate how the application will be used. Is it going to collect huge amounts of data and save it to disk? If so, ensure the system has good disk I/O performance. Is it going to be an app that analyzes data or crunches numbers? If so, chances are you will need a lot of CPU and memory capacity. When I am stuck working without much information, I normally deploy a dedicated VM for the app with 1-2 CPUs and 4 GB of RAM on Linux. In most cases that is sufficient to handle a decent application. I would call it a dev system and then determine, based on information collected as the application is developed, whether more or less is required for the eventual production release. Without a clear picture of what the application will do, the chances of success are not very good.
Updated
Jigar’s Answer
Hi Rashad! This is actually a very common problem for recommender systems, called the "cold start" problem, where you need to train a system to be able to offer good-enough recommendations before the system itself has a large set of users and items to recommend to them. You can find a great discussion of this, and some proposed ways to mitigate the problem, on Wikipedia's page on this "cold start" problem: https://en.wikipedia.org/wiki/Cold_start_(recommender_systems)
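One simple way to ease out of a cold start, sketched below under my own assumptions (it is a common heuristic, not something prescribed by the Wikipedia page), is to blend a popularity score with a personalized score and shift the weight toward the personalized one as a user accumulates history. The function names and the constant `k` are hypothetical.

```python
def blend_weight(n_interactions, k=10):
    """Weight given to personalized scores: 0 for a brand-new user,
    approaching 1 as history grows. k (an assumed tuning constant) is the
    number of interactions at which both sources count equally."""
    return n_interactions / (n_interactions + k)


def blended_scores(popularity, personal, n_interactions, k=10):
    """Combine two {item: score} dicts; items missing from one side score 0 there."""
    w = blend_weight(n_interactions, k)
    items = set(popularity) | set(personal)
    return {
        item: w * personal.get(item, 0.0) + (1 - w) * popularity.get(item, 0.0)
        for item in items
    }
```

With no history the result is pure popularity; after `k` interactions the two sources are weighted equally, so the system gradually hands over to personalized recommendations instead of switching abruptly.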