Applying Machine Learning Algorithms for Personalized Search Results

We built a search system that applies algorithms learned in the Discrete Structures course to deliver personalized search results.

Spring 2024 | Team G (6 members) | Python

Project Overview

This project collects Google search data through web crawling and analyzes how often the user enters each keyword to provide personalized search results.

Main Goals

  • Practical application of discrete structure algorithms
  • Implementation of a personalized search system
  • Practical use of machine learning techniques
  • Collaboration experience through team project

1. Data Collection

Extracting search data via web crawling

2. Data Analysis

Keyword frequency analysis and pattern recognition

3. Result Delivery

Output of prioritized, personalized results

Applied Algorithms

Graph Theory

Represents web pages and hyperlinks as nodes and edges to analyze data relationships and perform efficient web crawling.

Applications:

  • Optimization of crawling paths
  • Data relationship analysis
  • Network structure modeling
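
As an illustration of the graph model, the sketch below visits pages in breadth-first order over a small in-memory link graph. The function name `crawl_order`, the `max_pages` limit, and the sample graph are assumptions made for this example, and the project's actual traversal strategy may differ.

```python
from collections import deque

def crawl_order(links, start, max_pages=10):
    """Return pages in breadth-first order, starting from `start`."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue and len(order) < max_pages:
        page = queue.popleft()
        order.append(page)
        for neighbor in links.get(page, []):
            if neighbor not in visited:   # never enqueue a page twice
                visited.add(neighbor)
                queue.append(neighbor)
    return order

# Hypothetical link graph: each page maps to the pages it links to.
links = {
    "home": ["about", "news"],
    "about": ["home", "team"],
    "news": ["team"],
    "team": [],
}
print(crawl_order(links, "home"))  # ['home', 'about', 'news', 'team']
```

Tracking visited pages in a set keeps the crawl from looping on cyclic links such as "home" and "about" above.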

Boolean Search

Uses AND, OR, and NOT operators to refine queries and provide highly relevant results.

Applications:

  • Query parsing
  • Keyword filtering
  • Improved result accuracy
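
One common way to evaluate such queries is with set operations over an inverted index: AND becomes intersection, OR becomes union, and NOT becomes set difference. The following is a minimal sketch under that assumption; `build_index`, `boolean_search`, and the sample documents are hypothetical names and data, not the project's code.

```python
def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

def boolean_search(index, all_ids, must=(), any_of=(), must_not=()):
    """Return ids matching every `must` word, at least one `any_of` word
    (if any are given), and none of the `must_not` words."""
    result = set(all_ids)
    for word in must:                       # AND: intersect
        result &= index.get(word, set())
    if any_of:                              # OR: union, then intersect
        either = set()
        for word in any_of:
            either |= index.get(word, set())
        result &= either
    for word in must_not:                   # NOT: subtract
        result -= index.get(word, set())
    return result

documents = {
    1: "python web crawling tutorial",
    2: "python sorting algorithms",
    3: "java web server",
}
index = build_index(documents)
print(boolean_search(index, documents, must=["python"], must_not=["sorting"]))  # {1}
```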

Natural Language Processing

Analyzes the meaning of sentences and extracts sentences containing specific keywords.

Applications:

  • Semantic analysis
  • Keyword extraction
  • Sentence classification
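
A lightweight version of the keyword-extraction step can be sketched with the standard library alone. The stop-word list, the punctuation-based sentence splitter, and the function names below are simplifying assumptions; real semantic analysis would need considerably more than this.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a complete one).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "it"}

def split_sentences(text):
    """Split after '.', '!' or '?' followed by whitespace (rough heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def extract_keywords(text, top_n=3):
    """Return the most frequent non-stop-words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]

text = "Python is popular. Python is used in search. Is search hard?"
print(split_sentences(text))
print(extract_keywords(text, top_n=2))
```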

Heuristic Search

Tracks keyword frequency to set priorities and enhance the efficiency of search results.

Applications:

  • Search result prioritization
  • Efficient data exploration
  • Performance optimization
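
The prioritization idea can be illustrated with a simple scoring heuristic: each candidate result is scored by the past-usage counts of the keywords it contains, and the highest-scoring results are returned first. The scoring rule, the function names, and the sample counts are assumptions chosen for this sketch.

```python
import heapq

def score(result, keyword_counts):
    """Heuristic: sum the usage counts of every tracked keyword in the result."""
    text = result.lower()
    return sum(count for kw, count in keyword_counts.items() if kw in text)

def top_results(results, keyword_counts, k=2):
    """Return the k highest-scoring results, best first."""
    return heapq.nlargest(k, results, key=lambda r: score(r, keyword_counts))

# Hypothetical history: how often the user entered each keyword.
keyword_counts = {"python": 5, "crawler": 2, "java": 1}
results = [
    "Java basics for beginners",
    "Build a Python crawler",
    "Python sorting guide",
]
print(top_results(results, keyword_counts))
# ['Build a Python crawler', 'Python sorting guide']
```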

String Matching

Efficiently finds sentences containing user-input keywords from collected text.

Applications:

  • Keyword search
  • Pattern matching
  • Sentence filtering
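
A minimal sketch of sentence filtering, assuming case-insensitive whole-word matching is what is wanted (the source does not say whether partial matches count). The helper name `sentences_with_keyword` is hypothetical.

```python
import re

def sentences_with_keyword(sentences, keyword):
    """Keep sentences that contain `keyword` as a whole word, ignoring case."""
    # re.escape guards against keywords containing regex metacharacters.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]

sentences = [
    "Graph theory models links.",
    "A paragraph about sorting.",
    "Each GRAPH has nodes.",
]
print(sentences_with_keyword(sentences, "graph"))
# ['Graph theory models links.', 'Each GRAPH has nodes.']
```

The word-boundary anchors are why "paragraph" is not treated as a match for "graph".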

Sorting Algorithms

Sorts filtered words to present results in a more user-friendly way.

Applications:

  • Result sorting
  • Data structuring
  • User experience improvement
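
The source does not state the sort order used, so the sketch below assumes one plausible choice: most frequent words first, with ties broken alphabetically. It relies on Python's built-in `sorted` with a tuple key; `sort_words` is a hypothetical helper.

```python
from collections import Counter

def sort_words(words):
    """Return unique words, most frequent first, ties in alphabetical order."""
    counts = Counter(words)
    return sorted(counts, key=lambda w: (-counts[w], w))

words = ["search", "graph", "search", "tree", "graph", "search"]
print(sort_words(words))  # ['search', 'graph', 'tree']
```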

System Architecture

File I/O

Save/Load keywords

Web Crawling

Collect Google search results

Frequency Analysis

Track keyword frequency

Result Filtering

Select highly relevant information

Implementation Details

1. File I/O Function

Saves user-input keywords to a file and loads previously entered keywords at program execution to support continuous learning.
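
Persisting keywords between runs could look like the following sketch, which assumes keyword counts are stored as a JSON object. The file name `keywords.json`, the JSON format, and both function names are assumptions; the project's storage format is not described in the source.

```python
import json
import tempfile
from pathlib import Path

def load_keywords(path):
    """Load saved keyword counts, or return an empty dict on first run."""
    file = Path(path)
    if not file.exists():
        return {}
    return json.loads(file.read_text(encoding="utf-8"))

def save_keywords(path, counts):
    """Write keyword counts to disk as JSON."""
    text = json.dumps(counts, ensure_ascii=False, indent=2)
    Path(path).write_text(text, encoding="utf-8")

# Demonstration in a temporary directory so no real files are left behind.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "keywords.json"
    counts = load_keywords(path)                     # {} on first run
    counts["python"] = counts.get("python", 0) + 1   # record one entry
    save_keywords(path, counts)
    print(load_keywords(path))                       # {'python': 1}
```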

2. Web Crawling Function

Based on user-input keywords, automatically collects suggested queries and sentences containing the keywords from Google search results.
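
The parsing half of this step can be sketched offline with the standard library's `html.parser`. The example works on a hard-coded HTML string rather than a live request; how the project actually fetches and parses Google result pages is not stated, so the class and function names here are illustrative assumptions.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text chunks, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)

def extract_matching_text(html, keyword):
    """Return text chunks from the HTML that mention the keyword."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return [c for c in parser.chunks if keyword.lower() in c.lower()]

# Hypothetical page content standing in for a downloaded result page.
html = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<p>Python crawler basics</p><p>Cooking tips</p>"
    "<script>var python = 1;</script></body></html>"
)
print(extract_matching_text(html, "python"))  # ['Python crawler basics']
```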

3. Keyword Frequency Analysis

Tracks how often each keyword is entered. Once a keyword's count passes a set threshold, it is prioritized to enhance personalization.
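
The threshold rule might be sketched as follows. The threshold value of 3, the choice to treat "at least 3 entries" as prioritized, and the function names are all assumptions; the source does not give the actual cutoff.

```python
from collections import Counter

PRIORITY_THRESHOLD = 3  # hypothetical cutoff, not taken from the project

def record_keyword(counts, keyword):
    """Increment the count for a keyword (case-insensitive)."""
    counts[keyword.lower()] += 1

def prioritized_keywords(counts, threshold=PRIORITY_THRESHOLD):
    """Return keywords entered at least `threshold` times, most frequent first."""
    return [kw for kw, n in counts.most_common() if n >= threshold]

counts = Counter()
for entry in ["python", "Python", "graph", "python", "graph"]:
    record_keyword(counts, entry)
print(prioritized_keywords(counts))  # ['python']
```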

4. Result Output & Filtering

Extracts words from the collected sentences and keeps only those containing specific keywords, so that the output is limited to highly relevant results.
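
A minimal sketch of this filtering step, assuming a word "contains" a keyword when the keyword appears anywhere inside it, and that duplicates should be dropped. The function name `matching_words` is hypothetical.

```python
import re

def matching_words(sentences, keyword):
    """Return unique words containing the keyword, in first-seen order."""
    keyword = keyword.lower()
    seen = set()
    matches = []
    for sentence in sentences:
        for word in re.findall(r"\w+", sentence.lower()):
            if keyword in word and word not in seen:
                seen.add(word)
                matches.append(word)
    return matches

sentences = ["Searching is fun.", "A search engine researches pages."]
print(matching_words(sentences, "search"))
# ['searching', 'search', 'researches']
```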

Project Outcomes

Practical Algorithm Application

Successfully applied six algorithms learned in the Discrete Structures course to a real search system.

Improved Search Accuracy

Improved the accuracy and relevance of search results through keyword frequency tracking and personalization features.

Teamwork & Collaboration

Six team members completed the project through effective collaboration, each contributing in their area of expertise.

Practical Implementation

Implemented a fully functional search system in Python and made it accessible via a QR code.