YouTube does have automatic transcription for videos. Linking those transcripts to a topic hierarchy isn’t too hard in itself (maybe they already do this), but it seems like a hard problem at their scale, since unlike Spotify’s genres, the list of topics isn’t knowable in advance.
I’ve been building a search engine for lectures as a research project. For a small list of videos, I find that browsing a topic taxonomy is really nice compared to recommenders that try to guess your intent.
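A rough sketch of what taxonomy browsing can look like for a small collection, assuming each video has already been tagged with slash-separated topic paths. The titles, paths, and function names here are made up for illustration and aren't taken from my project or from any particular tagger's output.

```python
from collections import defaultdict

# Hypothetical tagged videos: title -> list of topic paths.
VIDEO_TAGS = {
    "Intro to Genetics": ["/science/biology/genetics"],
    "Cell Structure": ["/science/biology"],
    "Linear Algebra 1": ["/science/mathematics/algebra"],
}

def build_topic_tree(video_tags):
    """Map every topic (and each of its ancestors) to the videos under it."""
    tree = defaultdict(set)
    for title, paths in video_tags.items():
        for path in paths:
            parts = [p for p in path.split("/") if p]
            # Register the video at every level of its path.
            for depth in range(1, len(parts) + 1):
                tree["/".join(parts[:depth])].add(title)
    return tree

def children(tree, topic=""):
    """List the direct subtopics of a topic ('' means the root)."""
    prefix = topic + "/" if topic else ""
    depth = topic.count("/") + 2 if topic else 1
    return sorted(
        t for t in tree
        if t.startswith(prefix) and t.count("/") + 1 == depth
    )

tree = build_topic_tree(VIDEO_TAGS)
print(children(tree))              # top-level topics
print(children(tree, "science"))   # one level down
print(sorted(tree["science/biology"]))
```

The point is that the user drills down explicitly, so nothing has to guess what they want; the hard part is getting good topic paths in the first place.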
There are commercial systems for automatically tagging the text (e.g. Watson), but their hierarchies don’t go into niche areas. For example, the Watson taxonomy tagger has 1,000 tags.
For more niche topics, I’ve explored Watson’s entity recognition system, e.g. to recognize the names of diseases. The advantage is that it picks up terms it hasn’t seen. The problem is that you can only identify the kinds of entities that someone has trained a system to recognize.
The UI challenges are interesting as well. If Spotify identified 100 genres that interested me, they could pick any arbitrary subset of playlists and I’d be pretty happy. If I used YouTube to find home repair videos, and then they showed me videos about repairing parts of my house that aren’t broken, it’d get pretty irritating.