Suppose for the sake of our example that our database contains 6 titles. We will build a vector for each title, and then assemble them into a matrix \(D\). One approach might be to let the \(k\)th entry in each document vector equal the frequency of the corresponding word in that document. There are other possible ways that the documents in the database might be represented. Our database is then represented by an \(n\times 10\) matrix that has a row for each title.

The third entry of \(DX\) is the largest, which means that the third title in the database best matches the list of keywords in the search. The last three titles might also be reported as partial search matches. Note that the database search is nothing more than a single matrix multiplication, followed by a search of the vector \(DX\) to find the largest entries.
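To make the idea concrete, here is a minimal sketch of the search in plain Python. The vocabulary, the three titles, and the helper names `title_vector` and `search` are all made up for illustration; they are not the titles or keywords from the example above, and the word-frequency representation is just the one approach mentioned in the text.

```python
# Hypothetical vocabulary and titles, invented purely for illustration;
# they are NOT the titles or keywords from the original example.
vocabulary = ["linear", "algebra", "matrix", "search", "vector"]
titles = [
    "linear algebra basics",
    "matrix search methods",
    "vector and matrix algebra",
]


def title_vector(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    words = text.lower().split()
    return [words.count(term) for term in vocabulary]


# D has one row per title and one column per vocabulary word.
D = [title_vector(t, vocabulary) for t in titles]


def search(D, keywords, vocabulary):
    """Return the vector DX, one score per title."""
    # X holds the frequency of each vocabulary word in the keyword list.
    X = title_vector(" ".join(keywords), vocabulary)
    # Matrix-vector product: one dot product per row of D.
    return [sum(d * x for d, x in zip(row, X)) for row in D]


scores = search(D, ["matrix", "algebra"], vocabulary)

# The index of the largest entry of DX is the best-matching title.
best = max(range(len(scores)), key=scores.__getitem__)
print(scores, titles[best])
```

With these made-up titles, the third title contains both keywords, so its entry in \(DX\) is the largest, while the other two titles each match one keyword and could be reported as partial matches.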