Why aren’t you using pretrained models?

Pretrained neural networks have reached the point where they are good enough for many applications without further training. Many models with billions of parameters are freely available. However, not every company has people with machine learning experience. It is true that domain knowledge and care are required to build and deploy robust ML pipelines for end-users.

There is, however, huge potential in applying simple ML solutions to internal or personal challenges. To show how simple this can be, let’s build a semantic search function that could be of use to anyone tasked with writing (English) text.

A simple matter of programming, machine learning, pretraining

A dictionary is a valuable tool for writers. However, modern dictionaries that come with popular operating systems have been found to be “dry, functional, almost bureaucratically sapped of color or pop”.

Another issue with dictionaries is that they are one-directional. To find better words and expressions, you need to think of a word, look it up, and then chase references to explore the possibilities. What if, in addition to this forward search, a computer could look in the other direction (from the definitions to the words)?

We can address these points by applying a pretrained model to build a “semantic” version of Webster’s 1913 dictionary. What follows is a quick overview of the idea. Then you may want to look at the code, even if you are not a programmer: we need about 16 lines of Python to load the data, run it through a neural network, index it, and start searching.

Building a reverse dictionary

We’ll use a technique called sentence embedding to make the definitions and examples from Webster’s dictionary searchable. This is the semantic part of semantic search. Essentially, the meaning of a phrase is encoded into a vector of numbers, the output of a neural network.
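To make this concrete, here is a minimal sketch of the embedding step. It assumes the sentence-transformers library and the all-MiniLM-L6-v2 model as one possible choice; the repo linked below may use a different model, and the definitions shown are placeholders rather than actual Webster entries.

    from sentence_transformers import SentenceTransformer

    # One possible choice of pretrained sentence-embedding model
    model = SentenceTransformer("all-MiniLM-L6-v2")

    # Placeholder (word, definition) pairs standing in for the parsed Webster's 1913 data
    definitions = [
        ("astoundment", "Amazement; astonishment."),
        ("bewildered", "Greatly perplexed; confused."),
        ("blank", "Absence of thought or expression."),
        ("stound", "A state of astonishment; amazement."),
    ]

    # Encode every definition into a fixed-length vector (one row per definition)
    vectors = model.encode([text for _, text in definitions])
    print(vectors.shape)  # (number of definitions, embedding dimension)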

With the Webster-Vectors in memory (we get about 270,000 from the dictionary used), we can now query this dataset by encoding a search phrase into a query vector. To search for words, we compare how close the query is to vectors in the dataset.
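Continuing the sketch above, encoding a query works exactly like encoding a definition, and cosine similarity is one common way to measure how close the query is to the stored vectors (the actual implementation may use a different metric).

    import numpy as np

    # Encode the search phrase with the same model used for the definitions
    query_vec = model.encode(["I'm lost for words"])[0]

    # Cosine similarity: normalize all vectors, then take dot products
    unit_vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    unit_query = query_vec / np.linalg.norm(query_vec)
    similarities = unit_vectors @ unit_query  # one score per definition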

To find similar vectors, we run a nearest neighbor algorithm. It takes the query vector as input, looks through our dataset, and provides, for example, the top ten closest results. All that’s left for us to do is to return the words associated with these neighbor vectors. This will (ideally) result in a list of words close to the meaning of the search phrase.
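A brute-force scan over the similarities already works at this scale, but a nearest-neighbor index keeps queries fast. The sketch below uses scikit-learn’s NearestNeighbors as one option (the repo may use a dedicated vector index instead) and maps the neighbors back to their words.

    from sklearn.neighbors import NearestNeighbors

    # Index the definition vectors; cosine distance matches the similarity above
    index = NearestNeighbors(n_neighbors=min(10, len(definitions)), metric="cosine")
    index.fit(vectors)

    # Look up the closest definitions and return their words
    distances, neighbor_ids = index.kneighbors([query_vec])
    print([definitions[i][0] for i in neighbor_ids[0]])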

As an example, the phrase “I’m lost for words” yields: astoundment, bewildered, blank, confus, distraught, perplexly, stagger, stound. Please find the implementation in this GitHub repo.

Only a few lines of simple code and some compute are needed to do something useful with pretrained models. Not everything needs to be about “big data”, and by applying ML you may find it can add “big meaning” to your daily challenges. As the methods of “AI” (really, neural networks) have reached a point of consolidation, now is a good time to give them a try even if you haven’t worked with ML before.


You can find me on LinkedIn and GitHub