Sentence embedding

<h2 id="applications">Applications</h2>
<p>In recent years, sentence embedding has attracted growing interest because of its use in natural-language-queryable knowledge bases, where vector indexing enables semantic search. <a href="/facts/LangChain/QHRFgeYr">LangChain</a>, for instance, uses sentence transformers to index documents. An index is built by generating embeddings for chunks of documents and storing (document chunk, embedding) tuples. Given a query in natural language, an embedding for the query is generated in the same way. A top-k similarity search between the query embedding and the document chunk embeddings then retrieves the most relevant document chunks, which serve as context for <a href="/facts/Question_answering/1gplmc6L">question answering</a> tasks. This approach is known as <a href="/facts/Retrieval-augmented_generation/IhBo9cy5">retrieval-augmented generation</a>.<a class="footnote-ref" id="fnref:11" href="#fn:11"><sup>11</sup></a>
</p><p>Though not as predominant as BERTScore, sentence embeddings are commonly used for sentence similarity evaluation. One common use is optimizing a <a href="/facts/Large_language_model/WnogWVJY">large language model</a>'s generation parameters, which is often done by comparing candidate sentences against reference sentences. By using the cosine similarity between the sentence embeddings of candidate and reference sentences as the evaluation function, a grid-search algorithm can be used to automate <a href="/facts/Hyperparameter_optimization/bzcJJrxs">hyperparameter optimization</a>.
</p>
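<p>The indexing, top-k retrieval, and grid-search steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not how LangChain or any particular library implements them: <code>embed</code> is a toy bag-of-words stand-in for a trained sentence encoder, and <code>build_index</code>, <code>top_k</code>, <code>grid_search</code>, <code>fake_generate</code>, and the <code>max_words</code> parameter are hypothetical names introduced here.</p>

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence encoder: a sparse bag-of-words vector.

    Assumption for illustration only; a real system would call a trained model.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(token, 0) for token, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Store (document chunk, embedding) tuples."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def top_k(query, index, k=2):
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def grid_search(generate, grid, prompts, references):
    """Return the parameter setting whose outputs are closest to the references."""
    def mean_similarity(params):
        scores = [
            cosine(embed(generate(prompt, **params)), embed(reference))
            for prompt, reference in zip(prompts, references)
        ]
        return sum(scores) / len(scores)

    return max(grid, key=mean_similarity)


def fake_generate(prompt, max_words):
    """Hypothetical generator: echoes the prompt truncated to max_words words."""
    return " ".join(prompt.split()[:max_words])


# Retrieval: index three chunks, then fetch the best match for a query.
index = build_index([
    "cats sleep for most of the day",
    "stock markets fell sharply today",
    "the recipe needs two eggs",
])
hits = top_k("how long do cats sleep", index, k=1)

# Grid search: choose the max_words value whose output best matches the reference.
best = grid_search(
    fake_generate,
    grid=[{"max_words": 2}, {"max_words": 4}, {"max_words": 7}],
    prompts=["cats sleep for most of the day"],
    references=["cats sleep for most"],
)
```

<p>The index is scanned exhaustively here, which is adequate for a handful of chunks; larger collections typically rely on a vector index with approximate nearest-neighbour search instead.</p>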
<h2 id="evaluation">Evaluation</h2>
<p>One way of testing sentence encodings is to apply them to the Sentences Involving Compositional Knowledge (SICK) corpus<a class="footnote-ref" id="fnref:12" href="#fn:12"><sup>12</sup></a>
for both entailment (SICK-E) and relatedness (SICK-R).
</p><p>The best results reported by Conneau et al.<a class="footnote-ref" id="fnref:13" href="#fn:13"><sup>13</sup></a> are obtained using a <a href="/facts/Bidirectional_recurrent_neural_networks/B6ArJM8k">BiLSTM network</a> trained on the <a href="https://nlp.stanford.edu/projects/snli/">Stanford Natural Language Inference (SNLI) Corpus</a>: the <a href="/facts/Pearson_correlation_coefficient/Igf0xDPc">Pearson correlation coefficient</a> for SICK-R is 0.885 and the result for SICK-E is 86.3. Subramanian et al.<a class="footnote-ref" id="fnref:14" href="#fn:14"><sup>14</sup></a> report a slight improvement over these scores, with SICK-R at 0.888 and SICK-E at 87.8, using a concatenation of bidirectional <a href="/facts/Gated_recurrent_unit/afweFlCM">gated recurrent units</a>.
</p>
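<p>The two SICK metrics mentioned above can be computed as in the following Python sketch. It assumes that SICK-R is scored by the Pearson correlation between model similarity scores and human relatedness ratings, and that SICK-E is scored as classification accuracy. The function names and the sample numbers are made up for illustration and are not taken from the cited papers.</p>

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def accuracy(predicted, gold):
    """Fraction of predicted labels that match the gold labels."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Made-up relatedness data: model similarity scores vs. human ratings.
model_scores = [0.10, 0.45, 0.60, 0.95]
human_ratings = [1.2, 2.8, 3.1, 4.9]
relatedness_r = pearson(model_scores, human_ratings)

# Made-up entailment data: predicted vs. gold labels.
predicted_labels = ["ENTAILMENT", "NEUTRAL", "NEUTRAL", "CONTRADICTION"]
gold_labels = ["ENTAILMENT", "NEUTRAL", "CONTRADICTION", "CONTRADICTION"]
entailment_acc = accuracy(predicted_labels, gold_labels)
```

<p>Pearson correlation is invariant to linear rescaling, so the model scores do not need to be mapped onto the scale of the human ratings before the comparison.</p>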
<h2 id="see-also">See also</h2>
<ul><li><a href="/facts/Distributional_semantics/klV0l6o7">Distributional semantics</a></li>
<li><a href="/facts/Word_embedding/7uRcBPqo">Word embedding</a></li></ul>
<h2 id="external-links">External links</h2>

<ul><li><a href="https://github.com/facebookresearch/InferSent">InferSent sentence embeddings and training code</a></li>
<li><a href="https://arxiv.org/abs/1803.11175">Universal Sentence Encoder</a></li>
<li><a href="https://github.com/Maluuba/gensen">Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning</a></li></ul>

<h2 id="references">References</h2>

<ol>
<li id="fn:1"><p>Barkan, Oren; Razin, Noam; Malkiel, Itzik; Katz, Ori; Caciularu, Avi; Koenigstein, Noam (2019). "Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding". arXiv:1908.05161 [cs.LG]. <a href="#fnref:1" class="footnote-back-ref">↩</a></p></li>
<li id="fn:2"><p>"The Current Best of Universal Word Embeddings and Sentence Embeddings". <a href="https://medium.com/huggingface/universal-word-sentence-embeddings-ce48ddc8fc3a" target="_blank">https://medium.com/huggingface/universal-word-sentence-embeddings-ce48ddc8fc3a</a> <a href="#fnref:2" class="footnote-back-ref">↩</a></p></li>
<li id="fn:3"><p>Cer, Daniel; Yang, Yinfei; Kong, Sheng-yi; Hua, Nan; Limtiaco, Nicole; John, Rhomni St.; Constant, Noah; Guajardo-Cespedes, Mario; Yuan, Steve; Tar, Chris; Sung, Yun-Hsuan; Strope, Brian; Kurzweil, Ray (2018). "Universal Sentence Encoder". arXiv:1803.11175 [cs.CL]. <a href="#fnref:3" class="footnote-back-ref">↩</a></p></li>
<li id="fn:4"><p>Wu, Ledell; Fisch, Adam; Chopra, Sumit; Adams, Keith; Bordes, Antoine; Weston, Jason (2017). "StarSpace: Embed All the Things!". arXiv:1709.03856 [cs.CL]. <a href="#fnref:4" class="footnote-back-ref">↩</a></p></li>
<li id="fn:5"><p>Arora, Sanjeev; Liang, Yingyu; Ma, Tengyu (2016). "A simple but tough-to-beat baseline for sentence embeddings". OpenReview: SyK00v5xx. <a href="https://openreview.net/forum?id=SyK00v5xx" target="_blank">https://openreview.net/forum?id=SyK00v5xx</a> <a href="#fnref:5" class="footnote-back-ref">↩</a></p></li>
<li id="fn:6"><p>Trifan, Mircea; Ionescu, Bogdan; Gadea, Cristian; Ionescu, Dan (2015). "A graph digital signal processing method for semantic analysis". 2015 IEEE 10th Jubilee International Symposium on Applied Computational Intelligence and Informatics. pp. 187–192. doi:10.1109/SACI.2015.7208196. ISBN 978-1-4799-9911-8. S2CID 17099431. <a href="#fnref:6" class="footnote-back-ref">↩</a></p></li>
<li id="fn:7"><p>Basile, Pierpaolo; Caputo, Annalina; Semeraro, Giovanni (2012). "A Study on Compositional Semantics of Words in Distributional Spaces". 2012 IEEE Sixth International Conference on Semantic Computing. pp. 154–161. doi:10.1109/ICSC.2012.55. ISBN 978-1-4673-4433-3. S2CID 552921. <a href="#fnref:7" class="footnote-back-ref">↩</a></p></li>
<li id="fn:8"><p>Reimers, Nils; Gurevych, Iryna (2019). "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks". arXiv:1908.10084 [cs.CL]. <a href="#fnref:8" class="footnote-back-ref">↩</a></p></li>
<li id="fn:9"><p>Mikolov, Tomas; Chen, Kai; Corrado, Greg; Dean, Jeffrey (2013-09-06). "Efficient Estimation of Word Representations in Vector Space". arXiv:1301.3781 [cs.CL]. <a href="#fnref:9" class="footnote-back-ref">↩</a></p></li>
<li id="fn:10"><p>Ionescu, Radu Tudor; Butnaru, Andrei (2019). "Vector of Locally-Aggregated Word Embeddings (". Proceedings of the 2019 Conference of the North. Minneapolis, Minnesota: Association for Computational Linguistics. pp. 363–369. doi:10.18653/v1/N19-1033. S2CID 85500146. <a href="https://aclanthology.org/N19-1033" target="_blank">https://aclanthology.org/N19-1033</a> <a href="#fnref:10" class="footnote-back-ref">↩</a></p></li>
<li id="fn:11"><p>Lewis, Patrick; Perez, Ethan; Piktus, Aleksandra; Petroni, Fabio; Karpukhin, Vladimir; Goyal, Naman; Küttler, Heinrich; Lewis, Mike; Yih, Wen-tau; Rocktäschel, Tim; Riedel, Sebastian; Kiela, Douwe (2020). "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks". arXiv:2005.11401 [cs.CL]. <a href="#fnref:11" class="footnote-back-ref">↩</a></p></li>
<li id="fn:12"><p>Marelli, Marco; Menini, Stefano; Baroni, Marco; Bentivogli, Luisa; Bernardi, Raffaella; Zamparelli, Roberto (2014). "A SICK cure for the evaluation of compositional distributional semantic models". LREC. pp. 216–223. <a href="https://www.researchgate.net/profile/Marco_Marelli/publication/262484733_A_SICK_cure_for_the_evaluation_of_compositional_distributional_semantic_models/links/0deec537caccde8684000000.pdf" target="_blank">https://www.researchgate.net/profile/Marco_Marelli/publication/262484733_A_SICK_cure_for_the_evaluation_of_compositional_distributional_semantic_models/links/0deec537caccde8684000000.pdf</a> <a href="#fnref:12" class="footnote-back-ref">↩</a></p></li>
<li id="fn:13"><p>Conneau, Alexis; Kiela, Douwe; Schwenk, Holger; Barrault, Loic; Bordes, Antoine (2017). "Supervised Learning of Universal Sentence Representations from Natural Language Inference Data". arXiv:1705.02364 [cs.CL]. <a href="#fnref:13" class="footnote-back-ref">↩</a></p></li>
<li id="fn:14"><p>Subramanian, Sandeep; Trischler, Adam; Bengio, Yoshua; Pal, Christopher J. (2018). "Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning". arXiv:1804.00079 [cs.CL]. <a href="#fnref:14" class="footnote-back-ref">↩</a></p></li>
</ol>
