External merge sort algorithm for sorting a large number of records, with an option to use multiple threads.
Updated Mar 20, 2021 - Python
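The external merge sort named in the description above can be illustrated with a minimal single-threaded sketch. It is not the repository's code: the function name `external_sort`, the `chunk_lines` parameter, and the choice of newline-delimited text records compared as raw strings are all assumptions made for illustration. The input is split into sorted runs small enough to fit in memory, and the runs are then combined with a k-way merge.

```python
import heapq
import os
import tempfile
from itertools import islice


def external_sort(input_path, output_path, chunk_lines=100_000):
    """Sort a text file line by line while holding at most chunk_lines lines in memory.

    Hypothetical helper for illustration; records are assumed to be text lines.
    """
    run_paths = []
    try:
        # Phase 1: read fixed-size chunks, sort each in memory, write it out as a run.
        with open(input_path, "r", encoding="utf-8") as src:
            while True:
                chunk = list(islice(src, chunk_lines))
                if not chunk:
                    break
                # Terminate an unterminated final line so records stay separate after merging.
                chunk = [ln if ln.endswith("\n") else ln + "\n" for ln in chunk]
                chunk.sort()
                fd, path = tempfile.mkstemp(suffix=".run", text=True)
                with os.fdopen(fd, "w", encoding="utf-8") as run:
                    run.writelines(chunk)
                run_paths.append(path)

        # Phase 2: k-way merge. heapq.merge reads the runs lazily, one line per run at a time.
        runs = [open(p, "r", encoding="utf-8") for p in run_paths]
        try:
            with open(output_path, "w", encoding="utf-8") as dst:
                dst.writelines(heapq.merge(*runs))
        finally:
            for f in runs:
                f.close()
    finally:
        # Remove the temporary run files whether or not the sort succeeded.
        for p in run_paths:
            os.remove(p)
```

A multi-threaded variant could sort several runs concurrently in phase 1. This sketch opens every run at once during the merge, so a very large number of runs would need a multi-pass merge to stay under the operating system's open-file limit.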
A complete search engine built on top of a 75 GB Wikipedia corpus, with sub-second search latency. Results are wiki pages ordered by TF-IDF relevance to the given search terms. From optimized code to the k-way merge sort algorithm, this project addresses latency, indexing, and big data challenges.
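To make the TF-IDF ordering mentioned above concrete, here is a small in-memory sketch. It is an illustration under stated assumptions, not the project's implementation: the function name `rank_tf_idf`, whitespace tokenization, and the log-scaled weighting `(1 + log tf) * log(N / df)` are all choices made here, and a real engine over a corpus of this size would read term statistics from an on-disk index instead of a dictionary.

```python
import math
from collections import Counter


def rank_tf_idf(docs, query):
    """Return the ids of matching documents, ordered by descending TF-IDF score.

    docs maps a document id to its text. Hypothetical helper for illustration.
    """
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: the number of documents that contain each term.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term]:
                # Rare terms get a higher weight; a term present in every document gets 0.
                idf = math.log(n_docs / df[term])
                score += (1 + math.log(tf[term])) * idf
        if score > 0:
            scores[doc_id] = score

    return sorted(scores, key=scores.get, reverse=True)
```

For example, with the documents `{"a": "merge sort merge", "b": "quick sort", "c": "binary search tree"}`, the query `"sort merge"` ranks `a` ahead of `b` because `a` also contains the rarer term `merge`, and `c` is left out because it matches neither term.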
A mini Wikipedia search engine built on a 40 GB Wikipedia data dump from 2020. Results are retrieved in less than a second.
Sorting algorithms in Python.