score:2

Accepted answer

The reason you should use an ordinal scale instead of a linear scale is simple, although a lot of people get this wrong:

Bar charts, by their very nature, are made of bars representing a categorical variable. That is, each bar is positioned over a label that represents a category, i.e., a qualitative variable. When I said "a lot of people get this wrong", I was talking about the difference between a bar chart and a histogram: both use rectangles to encode data, but in a histogram, unlike in a bar chart, the label represents a quantitative variable. At least half a dozen times a month I see someone here on S.O. posting a question about histograms that are in fact bar charts, or about bar charts that are in fact histograms.
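To make the distinction concrete, here is a small sketch in plain JavaScript (no D3). The `histogramBins` helper and its parameters are hypothetical names made up for this illustration, not part of any library, and the range 0 to 30 with 3 bins is an arbitrary choice. In a histogram each rectangle counts how many values fall inside a numeric interval; in a bar chart there is one rectangle per category.

```javascript
// Hypothetical helper: counts how many values fall into equal-width
// numeric intervals [x0, x1). The last interval also includes `max`.
function histogramBins(values, min, max, binCount) {
  var width = (max - min) / binCount;
  var bins = [];
  for (var i = 0; i < binCount; i++) {
    bins.push({ x0: min + i * width, x1: min + (i + 1) * width, count: 0 });
  }
  values.forEach(function (v) {
    if (v < min || v > max) return; // ignore values outside the range
    var slot = Math.min(Math.floor((v - min) / width), binCount - 1);
    bins[slot].count++;
  });
  return bins;
}

var dataset = [5, 10, 13, 19, 21, 25, 22, 18, 15, 13];

// Histogram: the x axis is quantitative (intervals of the values themselves).
var bins = histogramBins(dataset, 0, 30, 3);

// Bar chart: the x axis is categorical (one label per datum, here its index).
var bars = dataset.map(function (d, i) {
  return { label: String(i), value: d };
});
```

The histogram ends up with 3 rectangles whose positions are numeric intervals, while the bar chart keeps 10 rectangles whose positions are just labels.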

So, given your data:

dataset = [5, 10, 13, 19, 21, 25, 22, 18, 15, 13];

The first bar corresponds to 5, the second bar corresponds to 10, and so on. The difference between the values of the bars is quantitative (for instance, "10 is twice as big as 5"), but the difference between the bars themselves is qualitative.

So, suppose we use the index of each datum to label the bars in this bar chart:

var w = 300,
  h = 200,
  padding = 20;
var svg = d3.select("body")
  .append("svg")
  .attr("width", w)
  .attr("height", h);

var dataset = [5, 10, 13, 19, 21, 25, 22, 18, 15, 13];

var xScale = d3.scaleBand()
  .range([30, w])
  .domain(d3.range(dataset.length))
  .padding(0.2);

var yScale = d3.scaleLinear()
  .range([h - padding, padding])
  .domain([0, d3.max(dataset)]);

var bars = svg.selectAll("foo")
  .data(dataset)
  .enter()
  .append("rect")
  .attr("x", (d, i) => xScale(i))
  .attr("width", xScale.bandwidth())
  .attr("height", d => h - padding - yScale(d))
  .attr("y", d => yScale(d))
  .attr("fill", "teal");

var gX = svg.append("g")
  .attr("transform", "translate(0," + (h - padding) + ")")
  .call(d3.axisBottom(xScale));
  
var gY = svg.append("g")
  .attr("transform", "translate(30,0)")
  .call(d3.axisLeft(yScale));

HTML (loads D3 v4, which the code above needs):

<script src="https://d3js.org/d3.v4.min.js"></script>

We see numbers, from 0 to 9, on the horizontal axis. Now comes the important part: those numbers are not in fact numbers, they are qualitative variables. You have bar number 0, bar number 1, bar number 2... but the difference between the bars (the bars per se, not their values) is qualitative, not quantitative (in that sense, 4 is not 2 times 2). They are just symbols, as if we had used "A", "B", "C" and so on for the labels.
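This can be checked with a simplified stand-in for a band scale, written in plain JavaScript. It is only a sketch: `simpleBand` is a hypothetical helper made up for illustration and it ignores padding, so it is not how `d3.scaleBand` is actually implemented. What it shows is that a label's position depends only on where that label sits in the domain, never on its numeric value.

```javascript
// Hypothetical, simplified band scale (no padding): each label in `domain`
// gets an equal-width slot inside `range`, in the order the labels are listed.
function simpleBand(domain, range) {
  var step = (range[1] - range[0]) / domain.length;
  var scale = function (label) {
    var slot = domain.indexOf(label);
    return slot === -1 ? undefined : range[0] + slot * step;
  };
  scale.bandwidth = function () {
    return step;
  };
  return scale;
}

// Numeric-looking labels and letters behave identically:
// byNumber(2) and byLetter("C") are both 100.
var byNumber = simpleBand([0, 1, 2, 3], [0, 200]);
var byLetter = simpleBand(["A", "B", "C", "D"], [0, 200]);

// Reordering the domain moves the label, whatever its value is:
// reordered(3) is 0, because the label 3 is simply the first slot.
var reordered = simpleBand([3, 0, 2, 1], [0, 200]);
```

A linear scale, by contrast, positions each input in proportion to its numeric value, which is exactly the quantitative reading these labels don't have.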

Of course, you can simply sort the data to display an ascending or descending bar chart, but sorting the plain array changes which label (index) goes with which value. If you use an array of objects, you can keep that relationship. For instance, have a look at this next snippet: the bars are sorted, but the categorical variable of each bar is the same as in your original data.

var w = 300,
  h = 200,
  padding = 20;
var svg = d3.select("body")
  .append("svg")
  .attr("width", w)
  .attr("height", h);

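var dataset = [5, 10, 13, 19, 21, 25, 22, 18, 15, 13];

// Pair each value with its original index so the label survives sorting.
var data = dataset.map((d, i) => ({index: i, value: d}));

// Sort the bars by value, largest first.
data.sort((a, b) => d3.descending(a.value, b.value));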

var xScale = d3.scaleBand()
  .range([30, w])
  .domain(data.map(d=>d.index))
  .padding(0.2);

var yScale = d3.scaleLinear()
  .range([h - padding, padding])
  .domain([0, d3.max(data, d=>d.value)]);

var bars = svg.selectAll("foo")
  .data(data)
  .enter()
  .append("rect")
  .attr("x", (d) => xScale(d.index))
  .attr("width", xScale.bandwidth())
  .attr("height", d => h - padding - yScale(d.value))
  .attr("y", d => yScale(d.value))
  .attr("fill", "teal");

var gX = svg.append("g")
  .attr("transform", "translate(0," + (h - padding) + ")")
  .call(d3.axisBottom(xScale));
  
var gY = svg.append("g")
  .attr("transform", "translate(30,0)")
  .call(d3.axisLeft(yScale));

HTML (loads D3 v4, which the code above needs):

<script src="https://d3js.org/d3.v4.min.js"></script>

That's why we use an ordinal scale (which defines the categorical variables) to create a bar chart.

