Can you shed a little light on how movie search results are sorted?
It generally seems pretty smart and taking result[0] usually works, especially for current movies. But for older movies I've noticed that even exact name matches can get buried a bit and I have to layer on my own system for working out a best match. A good example is "Camille 2000".
I don't need to know the exact algorithm, just what factors are considered so I don't end up replicating that in my own best-match code.
Message edited by cliffw 2 years ago
Steve N. – 2 years ago
Hi cliffw,
Our Data Architect, Maya, wrote a blog post a few months ago on how our search works. The link is below. It is skewed toward current movies, as you've already discovered.
http://root.rottentomatoes.com/2011/03/15/search-for-the-perfect-tomato/
Steve N. – 2 years ago
What are you using for your best-match code? Have you heard of the Levenshtein distance algorithm? For direct movie title matches, you may find it useful for comparing the title you're looking for against the results returned from the API search. See http://en.wikipedia.org/wiki/Levenshtein_distance
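For anyone who wants to try the Levenshtein suggestion, here is a minimal Python sketch. It assumes the search results have already been parsed into a list of dicts with a `title` key; that shape, and the helper names `levenshtein` and `closest_title`, are illustrative assumptions rather than anything the API guarantees.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def closest_title(query, results):
    """Return the result whose title is nearest to query, or None if empty.

    `results` is assumed to be a list of dicts with a "title" key
    (a hypothetical shape for already-parsed search results).
    """
    if not results:
        return None
    q = query.casefold().strip()
    return min(results, key=lambda r: levenshtein(q, r["title"].casefold().strip()))
```

Calling `closest_title("Camille 2000", results)` picks the entry with the smallest edit distance instead of blindly taking `results[0]`. An exact title match has distance 0, so it wins even if the search ranked it lower.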
cliffw – 2 years ago
Thanks, Steve! Luckily I don't have to get into user-entered name matching too much. I have a primary data source that I'm supplementing with Rotten Tomatoes data. So the best method I've found is to find data points where those two sets overlap (such as run time or cast members) to determine when I have a match.
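The overlap approach described above could look roughly like the sketch below. It is only an illustration: the `runtime` (minutes) and `cast` (list of names) keys, the two-minute tolerance, and the minimum score are all assumptions, not details of either data source.

```python
def overlap_score(primary, candidate, runtime_tolerance=2):
    """Count agreeing data points between two records.

    Both records are assumed to be dicts with optional "runtime" (minutes)
    and "cast" (list of names) keys; these field names are hypothetical.
    """
    score = 0
    p_run, c_run = primary.get("runtime"), candidate.get("runtime")
    # Run times often differ by a minute or two between sources.
    if p_run is not None and c_run is not None and abs(p_run - c_run) <= runtime_tolerance:
        score += 1
    # One point for every cast member that appears in both records.
    primary_cast = {name.casefold() for name in primary.get("cast", [])}
    candidate_cast = {name.casefold() for name in candidate.get("cast", [])}
    score += len(primary_cast & candidate_cast)
    return score


def best_overlap_match(primary, candidates, min_score=2):
    """Return the highest-scoring candidate, or None if none reaches min_score."""
    best = max(candidates, key=lambda c: overlap_score(primary, c), default=None)
    if best is None or overlap_score(primary, best) < min_score:
        return None
    return best
```

Requiring a minimum score means a record with no convincing overlap returns `None` rather than a wrong match, which seems safer when supplementing a primary data source.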