Given document text (and images?), get a concise topic description
E.g. given a bunch of text about WWII that focuses on specific battles, it might return “World War II, with a focus on specific battles”
Bag of words
Just feed all the text to an LLM and ask it to describe the topic in only a few words
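
A rough sketch of the two ideas above, under some loud assumptions. The notes don't say how "bag of words" and the LLM step relate, so this treats them as two separate options: a word-count baseline and a prompt-based one. `complete` is a hypothetical stand-in for whatever LLM client ends up being used (any function that takes a prompt string and returns a string); the stopword list, the prompt wording, and the `max_words` limit are likewise made up for illustration. Images are not handled.

```python
import re
from collections import Counter
from typing import Callable

# Tiny illustrative stopword list (an assumption; a real one would be longer).
STOPWORDS = {
    "the", "a", "an", "of", "and", "in", "to", "on", "at",
    "was", "were", "is", "with", "for", "by",
}


def top_terms(text: str, k: int = 5) -> list[str]:
    """Bag-of-words baseline: the k most frequent non-stopword tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [term for term, _ in counts.most_common(k)]


def build_prompt(text: str, max_words: int = 12) -> str:
    """Wrap the full document text in a short instruction (wording is a guess)."""
    return (
        f"Describe the topic of the following document in at most {max_words} words. "
        "Reply with the description only.\n\n"
        f"Document:\n{text}"
    )


def describe_topic(
    text: str,
    complete: Callable[[str], str],  # hypothetical LLM call: prompt in, reply out
    max_words: int = 12,
) -> str:
    """Feed all the text to the LLM and return its short topic description."""
    return complete(build_prompt(text, max_words)).strip()


if __name__ == "__main__":
    doc = (
        "The battle of Midway was a naval battle. "
        "The battle of Stalingrad was a land battle."
    )
    print(top_terms(doc, 3))
    # Stand-in for a real model call, so the sketch runs without any API.
    print(describe_topic(doc, lambda prompt: "World War II battles"))
```

Passing the model call in as a plain function keeps the sketch independent of any particular provider, and makes it easy to swap in a fake for testing. Very long documents would likely need truncating or chunking before the prompt is built, which is not shown here.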