Showing Top & Bottom values in one clear visualization

A simple pattern to compare extremes of a distribution

data-wrangling

functions

R-Hacks N.4

Author

Lucio Colonna

Published

January 23, 2026

This hack is based on my Kaggle analysis, linked below (please see Chapter 4.3):
🔗 https://www.kaggle.com/code/lcolon/exploring-2024-software-engineer-salaries

When exploring a distribution, a very common question is:

Who are the top performers, and who are at the bottom?

Showing only the Top values might hide some context, while showing two separate charts can make comparison harder.

This R-Hack shows a simple and reusable pattern to display Top and Bottom values together in a single, clean visualization, like the one shown below:

Step 0 – Create an example dataset

Let’s start from a small, simulated dataset representing average salaries for different companies:

library(tidyverse)

set.seed(123)

df <- data.frame(
  company = LETTERS[1:26],
  avg_salary = round(runif(26, min = 60, max = 180), 0)
)

head(df)

  company avg_salary
1       A         95
2       B        155
3       C        109
4       D        166
5       E        173
6       F         65

Step 1 – Build Top & Bottom datasets in a single pipeline

In this step, we extract both the Top 10 and Bottom 10 values using a single, readable pipeline.

The idea is simple. Starting from the same dataset:

select the Top 10 by applying slice_max(), which returns all observations within the top 10 ranking positions, including any ties
select the Bottom 10 by applying slice_min(), returning all observations within the lowest 10 ranking positions, again including ties
assign a clear group label to each subset
combine the two subsets into a single dataframe for visualization and further analysis

plot_df <- 
  bind_rows(
    df %>% 
      slice_max(avg_salary, n = 10, with_ties = TRUE) %>%
      mutate(group = "Top 10"),
    
    df %>% 
      slice_min(avg_salary, n = 10, with_ties = TRUE) %>%
      mutate(group = "Bottom 10")
  )

head(plot_df)

  company avg_salary  group
1       X        179 Top 10
2       K        175 Top 10
3       T        175 Top 10
4       E        173 Top 10
5       P        168 Top 10
6       H        167 Top 10
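
To see what with_ties = TRUE actually changes, here is a minimal sketch on a toy dataset. The toy data frame and its values are made up for illustration and are not part of the original analysis:

```r
library(dplyr)

# Hypothetical toy data: companies B and C are tied at 90
toy <- data.frame(
  company = c("A", "B", "C", "D"),
  avg_salary = c(100, 90, 90, 80)
)

# with_ties = TRUE keeps every row tied at the cutoff,
# so asking for n = 2 returns 3 rows here
top_with_ties <- slice_max(toy, avg_salary, n = 2, with_ties = TRUE)
nrow(top_with_ties)  # 3

# with_ties = FALSE returns exactly n rows and drops one of the tied rows
top_no_ties <- slice_max(toy, avg_salary, n = 2, with_ties = FALSE)
nrow(top_no_ties)  # 2
```

This is why a "Top 10" group can end up with more than 10 rows when ties fall at the cutoff.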

Step 2 – Plot the data

Starting from the dataframe created in the previous step, we build a simple visualization with ggplot2:

plot_base <- ggplot(plot_df, 
                    aes(x = reorder(company, -avg_salary), y = avg_salary)) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = avg_salary), vjust = -0.4, size = 4) +
  facet_wrap(~ fct_rev(group), scales = "free_x") +
  labs(
    title = "Top and Bottom Companies by Average Salary",
    subtitle = "Simulated data example\n\n",
    y = "Average yearly salary",
    x = NULL
  ) +
  theme_minimal() +
  theme(
    panel.grid = element_blank(),
    axis.text.x = element_text(size = 10),
    axis.text.y = element_blank()
  )

plot_base
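
Two small details in the code above control the ordering: reorder() for the bars and fct_rev() for the panels. The sketch below illustrates both on toy values using base R only. The toy data is an assumption for illustration, and rev(levels(...)) stands in for what fct_rev() does to the level order:

```r
# Hypothetical toy data, not taken from the analysis
toy <- data.frame(
  company = c("A", "B", "C"),
  avg_salary = c(95, 155, 109)
)

# reorder() sorts factor levels by ascending value of its second argument,
# so negating the salary puts the highest-paying company first
bar_order <- levels(reorder(toy$company, -toy$avg_salary))
bar_order  # "B" "C" "A"

# Default factor levels are alphabetical ("Bottom 10" before "Top 10");
# reversing them places the Top panel on the left
panel_order <- rev(levels(factor(c("Top 10", "Bottom 10"))))
panel_order  # "Top 10" "Bottom 10"
```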

Step 3 – Generate gradient color palettes

Now that the base chart is working, we can make it more informative by adding two gradient color palettes:

one gradient for the Top group (blue shades)
one gradient for the Bottom group (red shades)

Instead of assigning a single color per group, we assign a slightly different shade to each bar. This creates a clean gradient effect while keeping the plot readable, even when ties produce more than 10 observations per group.

To do this, we:

generate a group-specific palette with colorRampPalette(), sized to the number of rows in each group (so it adapts if ties expand the selection)
group the data by group (Top vs Bottom)
assign colors row by row using row_number()
store the result in a new column called color

plot_df <- plot_df %>%
  group_by(group) %>%
  mutate(
    color = if_else(
      group == "Top 10",
      colorRampPalette(c("darkblue", "lightblue"))(n())[row_number()],
      colorRampPalette(c("darkred", "lightcoral"))(n())[row_number()]
    )
  ) %>%
  ungroup()

head(plot_df)

# A tibble: 6 × 4
  company avg_salary group  color  
  <chr>        <dbl> <chr>  <chr>  
1 X              179 Top 10 #00008B
2 K              175 Top 10 #131895
3 T              175 Top 10 #26309F
4 E              173 Top 10 #3948A9
5 P              168 Top 10 #4C60B3
6 H              167 Top 10 #6078BD
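
If colorRampPalette() is new to you, the following minimal sketch shows how it behaves in isolation. The variable names are chosen just for this example:

```r
# colorRampPalette() returns a function; calling that function with k
# gives k hex colors interpolated between the two endpoints
blue_ramp <- colorRampPalette(c("darkblue", "lightblue"))

pal_10 <- blue_ramp(10)  # one shade per bar when the group has 10 rows
pal_12 <- blue_ramp(12)  # ties would simply yield a longer palette

pal_10[1]       # "#00008B" (darkblue)
pal_10[10]      # "#ADD8E6" (lightblue)
length(pal_12)  # 12
```

Because the palette length is passed in as n() inside the grouped mutate(), each group always gets exactly as many shades as it has rows.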

Step 4 – Style the chart using the new colors

At this point, we don’t want to rewrite the entire ggplot call. Instead, we:

reuse the base chart (plot_base)
replace its dataset with the updated plot_df (the one that now includes color) using the %+% operator
add a new geom_col() that maps fill to the color column
use scale_fill_identity() so ggplot uses the colors as they are listed in the color column

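# Swap in the updated data, then draw the gradient bars over the base layer
plot_styled <- (plot_base %+% plot_df) +
  geom_col(aes(fill = color)) +  # covers the steelblue bars from plot_base
  scale_fill_identity()          # use the hex codes in `color` as they are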

plot_styled
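
To check that scale_fill_identity() really passes the hex codes through untouched, here is a small sketch on toy data. The data frame and its colors are assumptions made for this example. It uses layer_data() from ggplot2 to inspect the fill values the plot would draw:

```r
library(ggplot2)

# Hypothetical toy data with one literal hex color per row
toy <- data.frame(
  company = c("A", "B", "C"),
  avg_salary = c(95, 155, 109),
  color = c("#00008B", "#8B0000", "#ADD8E6")
)

p <- ggplot(toy, aes(x = company, y = avg_salary)) +
  geom_col(aes(fill = color)) +
  scale_fill_identity()

# The fills computed for the first layer are the same hex codes
# stored in the `color` column, with no palette mapping applied
drawn_fills <- layer_data(p, 1)$fill
drawn_fills
```

Without scale_fill_identity(), ggplot2 would treat the color column as a categorical variable and assign its own default palette.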

In short

Extract Top 10 and Bottom 10 values from the same dataset using slice_max() and slice_min(), including ties
Combine the two subsets into a single dataframe for a compact overview of the distribution extremes
Build a clean base plot to validate structure and layout
Enhance the visualization by applying gradient colors to distinguish Top and Bottom groups
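
If you plan to reuse the pattern, Step 1 can be wrapped in a small function. This is only a sketch: the name top_bottom() and its arguments are hypothetical and not part of the original notebook.

```r
library(dplyr)

# Hypothetical helper: returns the top and bottom `n` rows of `data`,
# ranked by the column `value`, with a `group` label on each row
top_bottom <- function(data, value, n = 10) {
  bind_rows(
    data %>%
      slice_max({{ value }}, n = n, with_ties = TRUE) %>%
      mutate(group = paste("Top", n)),
    data %>%
      slice_min({{ value }}, n = n, with_ties = TRUE) %>%
      mutate(group = paste("Bottom", n))
  )
}

# Same simulated dataset as in Step 0
set.seed(123)
df <- data.frame(
  company = LETTERS[1:26],
  avg_salary = round(runif(26, min = 60, max = 180), 0)
)

plot_df <- top_bottom(df, avg_salary, n = 10)
table(plot_df$group)
```

The {{ value }} syntax lets you pass the column name unquoted, as you would with slice_max() itself.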

Tip

If you want to stay up to date with the latest events from the Rome R Users Group, click here:

👉 https://www.meetup.com/rome-r-users-group/

And if you are curious, the full Kaggle notebook used for this tip is available here:

🔗 https://www.kaggle.com/code/lcolon/exploring-2024-software-engineer-salaries