The PostgreSQL question that will be answered in this post can be found here, and it is the following:
This is the table that we start with, named facebook_friends:
First, we build a Common Table Expression (CTE) that will make it easier to obtain the final result later. With this CTE, we group by each of the two user columns and then merge them with the UNION operator:
WITH all_users AS (
(SELECT
user1 AS user_id,
COUNT(user2) AS n_friends
FROM facebook_friends
GROUP BY 1)
UNION
(SELECT
user2 AS user_id,
COUNT(user1) AS n_friends
FROM facebook_friends
GROUP BY 1)
)
Looking at the initial table, we know that friendships are not duplicated (as, for example, users 2 and 1 in one row and 1 and 2 in another row in the first and second columns respectively). Therefore, we can group by each of the two user columns and count the number of friends that each user has. After that, we combine the two result sets with the UNION operator, which gives us the following table:
Now we can simply group by the user_id column and sum the number of friends to get the desired result. This is the full query:
-- Creating a CTE from which we can then calculate the number of friends each user has
WITH all_users AS (
(SELECT
user1 AS user_id,
COUNT(user2) AS n_friends
FROM facebook_friends
GROUP BY 1)
UNION
(SELECT
user2 AS user_id,
COUNT(user1) AS n_friends
FROM facebook_friends
GROUP BY 1)
)
-- Now we can get the requested table very easily
SELECT
user_id,
SUM(n_friends) / (SELECT COUNT(DISTINCT(user_id)) FROM all_users) * 100 AS popularity_perc
FROM all_users
GROUP BY 1
ORDER BY 1;
After executing this query, we obtain the requested table:
As we can see, user 1 is the most popular, as he is friends with 55.56% of the total users on the platform.
Deja un comentario