Category: Data Science
Difficulty: Medium

#12 Tackling DataLemur SQL Challenges for Data Science Interviews πŸ§ πŸ’‘

Mastering SQL is a crucial skill for data science interviews. Today, we’re diving into three SQL challenges from DataLemur. These problems not only help you sharpen your SQL skills but also give you practical insights into real-world data science scenarios. Let’s explore the solutions! πŸš€


1. Histogram of Tweets [Twitter SQL Interview Question] 🐦

Description
In this challenge, we need to create a histogram that shows how many tweets users posted in 2022. The goal is to group users based on their tweet count and return the number of users in each group.

Schema:

  • tweets table:
    • tweet_id (integer)
    • user_id (integer)
    • msg (string)
    • tweet_date (timestamp)

Example Input:

tweet_id user_id msg tweet_date
214252 111 Am considering taking Tesla private at $420... 12/30/2021 00:00:00
739252 111 Despite the constant negative press covfefe 01/01/2022 00:00:00
846402 111 Following @NickSinghTech changed my life! 02/14/2022 00:00:00
241425 254 If the salary is so competitive... 03/01/2022 00:00:00

Solution:

SELECT tweet_count AS tweet_bucket, COUNT(user_id) AS users_num
FROM (
    SELECT user_id, COUNT(*) AS tweet_count
    FROM tweets
    WHERE tweet_date BETWEEN '2022-01-01' AND '2022-12-31'
    GROUP BY user_id
) AS tweet_histogram
GROUP BY tweet_count
ORDER BY tweet_count;

2. Data Science Skills [LinkedIn SQL Interview Question] πŸ§‘β€πŸ’»

Description
This challenge involves finding candidates who are proficient in Python, Tableau, and PostgreSQL. We’re tasked with listing candidates who have all these skills and ordering them by candidate ID.

Schema:

  • candidates table:
    • candidate_id (integer)
    • skill (varchar)

Example Input:

candidate_id skill
123 Python
123 Tableau
123 PostgreSQL
234 R
345 Python

Solution:

SELECT candidate_id
FROM candidates
WHERE skill IN ('Python', 'Tableau', 'PostgreSQL')
GROUP BY candidate_id
HAVING COUNT(DISTINCT skill) = 3
ORDER BY candidate_id;

3. Page With No Likes [Facebook SQL Interview Question] πŸ‘βŒ

Description
Here, we need to find Facebook pages that have received zero likes. The task is to return the IDs of these pages, sorted in ascending order.

Schema:

  • pages table:
    • page_id (integer)
    • page_name (varchar)
  • page_likes table:
    • user_id (integer)
    • page_id (integer)
    • liked_date (datetime)

Example Input:

page_id page_name
20001 SQL Solutions
20045 Brain Exercises
20701 Tips for Data Analysts

Solution:

SELECT page_id
FROM pages
LEFT JOIN page_likes ON pages.page_id = page_likes.page_id
WHERE page_likes.page_id IS NULL
ORDER BY page_id;

Wrapping Up πŸŽ‰

These DataLemur SQL challenges are a great way to test your skills and prepare for data science interviews. Whether it’s creating histograms of user activity, identifying top candidates, or filtering data with joins, these exercises cover key SQL concepts. Stay tuned for more data science insights! πŸ’ͺ

Written on September 27, 2024