This is like a 1D version of what a scientific computing person might describe as the distinction between node-centered ("standard" or "mid-tread" in this post) vs cell-centered ("alternative", "mid-riser") samples: do you consider values to be at the middles of bins (or middles of triangles, or middles of tetrahedra), or at the boundaries between intervals (or vertices of triangles, or of tets)?
In a scientific computing setting it would be insane to start doing data processing without knowing how to interpret the values. In the context of audio signal processing, if you just get a stream of integers, you'd have to know the representational intent of those integers (mu-law encoding or linear?) if you're going to compute anything about the underlying signal. The meta-data accompanying the values would hopefully provide the answer.
But with 8-bit pixel values, absent any meta-data from a competent file format that can communicate the representational intent, we're adrift and there's no right answer (like the author says). Certainly no one can fault you for picking whichever one seems to give better results for your application, but you can raise awareness that bits without context have had their meaning undermined.
moefh 10 hours ago [-]
This problem of what exactly a color value means is mostly inconsequential when you have 8 bits per component, the difference in the denominator being either 255 or 256 makes the errors tiny, you must have really good color perception and get really close to the screen to see any difference at all, and your monitor/phone screen is probably not calibrated anyway, so who cares.
It becomes a pain in the ass when you're generating a VGA signal with a microcontroller with 8 color output pins (3 red, 3 green, 2 blue). The meaning of a color value is very real in this setup: it corresponds to a voltage level you must send to the VGA monitor, 0V-0.7V.
So the blue channel will map (0->0V, 1->0.23V, 2->0.47V, 3->0.7V), and the red/green will map (0->0V, 1->0.1V, ..., 7->0.7V). Notice how none of the blue voltages match any of the red/green ones (other than the extremes)? That means you don't get to see any pure grays -- the closest ones will have a bit of blue or yellow tint, depending on the direction of the difference.
Not only that, any gradients at all (other than the ones not mixing blue with the other channels) will be noticeably off: for example, the closest colors in the line between pure red and pure white will all be slightly orange or purple.
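A minimal sketch of the mismatch described above, assuming each channel spreads its codes evenly across 0 V to 0.7 V (the helper name `channel_levels` is made up for illustration). It lists the voltages a 3-bit and a 2-bit channel can produce and finds which ones coincide:

```python
from fractions import Fraction

V_MAX = Fraction(7, 10)  # 0.7 V full scale, as stated in the comment


def channel_levels(bits):
    """Evenly spaced output voltages for a channel with the given bit depth."""
    top = 2**bits - 1
    return [V_MAX * code / top for code in range(top + 1)]


red_green = channel_levels(3)  # 0, 0.1, ..., 0.7 V
blue = channel_levels(2)       # 0, 7/30, 14/30, 0.7 V

# Voltages that both kinds of channel can produce exactly.
shared = sorted(set(red_green) & set(blue))
print([float(v) for v in shared])  # only the two extremes are shared
```

Exact fractions are used so that the comparison is not affected by floating-point rounding.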
> That means you don't get to see any pure grays -- the closest ones will have bit of blue or yellow tint, depending on the direction of the difference.
OMG I remember as a kid staring at static-y CRT displays, and seeing these faint blue and yellow lines at the borders of them. I’d always wondered why they appeared and why they were specifically blue and yellow. I finally know! (at least, assuming those specific artifacts are due to the same thing)
adastra22 2 hours ago [-]
specifically on the edges? I would guess that is the phosphor layout, that the 'gray' beam is hitting a blue phosphor but not the red & green, or vice-versa.
chongli 8 hours ago [-]
You forgot about gamma correction. Before converting a value in the range of 0-255 into a voltage, PCs typically raise that value to the power of 2.2. This makes the difference between small values and large values far more apparent:
2^2.2 = 4.595, 255^2.2 = 196,964.699
spider-mario 51 seconds ago [-]
Differences between small and large values are irrelevant to the point being made here, though. Much more relevant is the difference between nearby values, and the gamma just gets that closer to logarithmic perception, instead of perceptual steps being disproportionately large for small values.
(This may be more apparent when you frame gamma as being applied in the 0-1 range, so it doesn’t really turn 2 into 4.595 and 255 into ~200k; it turns (2/255)≈0.00784 into (2/255)^2.2 ≈ 0.0000233, and leaves (255/255)=1 as is.)
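To make the 0-1 framing concrete, here is a small sketch. It assumes a pure power-law gamma of 2.2, which is a simplification (real sRGB uses a piecewise curve), and `apply_gamma` is an illustrative name only:

```python
def apply_gamma(code, gamma=2.2, max_code=255):
    """Normalize an 8-bit code to [0, 1], then apply a pure power-law gamma."""
    return (code / max_code) ** gamma


for code in (2, 255):
    print(code, apply_gamma(code))
```

Code 255 stays at 1.0, while code 2 lands at roughly 0.0000233, matching the numbers in the comment.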
londons_explore 2 hours ago [-]
Dithering in time seems like the best solution to this problem. Delta sigma modulation per pixel can be done reasonably easily.
Changing at 30Hz I doubt a human can tell the difference between slightly blue and slightly yellow.
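A rough sketch of what per-pixel temporal dithering could look like, using a first-order error-feedback loop as a stand-in for delta-sigma modulation. The function name, the 4-level (2-bit) channel, and the 60-frame window are all assumptions for illustration, not anything specified in the comment:

```python
import math


def delta_sigma_frames(target, levels, n_frames):
    """Emit one integer code per frame so the running average tracks `target`.

    target is the desired intensity in [0, 1]; levels is the number of
    output codes the channel supports.
    """
    top = levels - 1
    error = 0.0
    frames = []
    for _ in range(n_frames):
        wanted = target * top + error           # ideal value plus carried error
        code = min(top, max(0, math.floor(wanted + 0.5)))
        error = wanted - code                   # carry the residual forward
        frames.append(code)
    return frames


# Half intensity on a 2-bit channel falls between codes 1 and 2.
frames = delta_sigma_frames(0.5, levels=4, n_frames=60)
average = sum(frames) / len(frames)
print(frames[:6], average)
```

The output alternates between the two neighbouring codes, and the time average equals the value that no single code can represent.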
tdeck 6 hours ago [-]
> Notice how none of the blue voltages match any of the red/green ones (other than the extremes)? That means you don't get to see any pure grays
I assume this is why RGBI color was so common in the 80s.
fps-hero 7 hours ago [-]
This was a genuinely thought provoking article. I had to challenge some personal assumptions.
Coming from an electrical engineering background, I disagree with how the author presented "Two types of quantizers". Mathematically rigorous, but not grounded in practical systems.
In ADCs, there is always an inherent +-1/2 LSB of quantisation uncertainty. The transfer characteristic is always mid-tread sampling, or at least I haven't come across any counter examples. This is true for bipolar or unipolar ADCs.
The lowest code is negative voltage reference, and the highest code the positive reference. The transfer characteristic plot will show what the author has demonstrated, that the highest and lowest bins will effectively be 1/2LSB in width.
In a unipolar system, this has the consequence of not being able to represent the midpoint voltage precisely, or in other words, the gray problem. In a bipolar system, 0V will be mid-tread N/2 value, but that doesn't mean it has "256 ranges".
So, I'll be sticking with (VREF+ - VREF-) * k / (2^N - 1). Or in other words I agree with the normalisation by 255. It's the fence post error all over again: you have N values, but N-1 ranges. If you have fewer ranges than you do values, you need to distribute 1 of those ranges between two values, hence the 1/2LSB range endpoints.
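For concreteness, a small sketch of that normalisation for an N-bit code k. It assumes the formula gives the voltage relative to VREF-, so the function adds VREF- back to get an absolute value; the name `code_to_voltage` and the 0 V and 1 V references are illustrative only:

```python
def code_to_voltage(k, n_bits, vref_neg, vref_pos):
    """Map code k to a voltage using 2^N - 1 equal steps between the references."""
    steps = 2**n_bits - 1
    return vref_neg + (vref_pos - vref_neg) * k / steps


# Unipolar 8-bit example with assumed 0 V and 1 V references.
lowest = code_to_voltage(0, 8, 0.0, 1.0)
highest = code_to_voltage(255, 8, 0.0, 1.0)

# The two codes nearest the middle straddle 0.5 V without hitting it.
around_mid = (code_to_voltage(127, 8, 0.0, 1.0),
              code_to_voltage(128, 8, 0.0, 1.0))
print(lowest, highest, around_mid)
```

Both references are hit exactly, and the midpoint voltage falls between codes 127 and 128, which is the "gray problem" mentioned above.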
kroeckx 57 minutes ago [-]
All ADCs I have looked at document that they can't represent the positive full scale. For instance, for an 8 bit ±1 V ADC, -128 represents -1 V, +127 represents 127/128=0.99219 V. The transition from 126 to 127 happens at 1.5 LSB from the positive full range. 1 LSB difference represents 1/128 = 0.00781 V difference, and not 2 / 255 = 0.00784 V.
But if you actually care about what the voltage (and uncertainty) is, most of this difference is pointless: your reference will have a bias, there are linearity errors, and so on. 1 LSB will not match either 1/128 or 2/255; you will need parameters to compensate for it.
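The numbers in the 8-bit ±1 V example can be reproduced with a short sketch. The function name is hypothetical and the model is the idealised datasheet convention only, with no offset, gain, or linearity error:

```python
def signed_code_to_volts(code, n_bits=8, full_scale=1.0):
    """Two's-complement convention: 1 LSB = full_scale / 2^(N-1)."""
    return code * full_scale / 2 ** (n_bits - 1)


lsb_power_of_two = 2.0 / 256  # 1/128 V per step, positive full scale unreachable
lsb_endpoint = 2.0 / 255      # step size if both -1 V and +1 V were exact codes

print(signed_code_to_volts(-128), signed_code_to_volts(127))
print(round(lsb_power_of_two, 5), round(lsb_endpoint, 5))
```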
BearOso 9 hours ago [-]
There is a fallacy here in assuming there are 256 steps from 0 to 255. That's not true: there are 256 values that can be represented in 8 bits, and 255 steps (spaces between those values) from 0 (black) to 255 (pure white). Thus, the division by 255 isn't problematic. Of course, 128 isn't half grey: it isn't the midpoint of 0-255, and quantized 8-bit values are almost always in sRGB, not linear perceptive space.
This is the same kind of confusion that happens with sampling positions in modern APIs, where the location is specified in coordinates and not in pixel centers.
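To illustrate the "128 isn't half grey" remark, here is a sketch that decodes an 8-bit sRGB code to linear light using the standard sRGB piecewise transfer function. It assumes the data really is sRGB-encoded and normalised by 255:

```python
def srgb_to_linear(c):
    """Decode one sRGB-encoded component in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


half_code = srgb_to_linear(128 / 255)
print(half_code)  # roughly 0.22, well below 0.5
```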
herf 14 hours ago [-]
I'll argue for the +0.5 solution. First, I don't like half-sized intervals at the edges, and second, a 255-based representation is typically a SDR (not HDR) image.
RGB values represent luminances against some adapted state, and a "zero" in a daylit scene is not "zero luminance" - it's just about 0.001x as bright as the brightest point - it's millions of photons, way more than zero. In a sense our eyes experience contrast on a sliding scale, and there is no absolute zero in the system. For example, broadcast systems historically used 16-235 as their luminance range for SDR. I think any argument that says "we must have zero" is going to have a bias, but I don't think zero is needed for most things.
pixelesque 12 hours ago [-]
As someone with a lot of experience in this area doing image processing and rendering for VFX (including writing image readers and writers for my own software and commercial VFX software), I think you might be forgetting that colourspace conversion (to sRGB 'linear' rec709 for old-school SDR, but other, wider gamuts for newer formats) would happen after this, so the 'squish' of the dynamic range would happen after loading.
Also, a lot of workflows for image processing and compositing do assume that 0 means zero, whether correctly or not (often incorrectly). So there are often assumptions that for 8-bit, 0u maps to 0.0f and 255 maps to 1.0f for things like masking or alpha: as soon as you have 0 values which become just over 0.0, you then have artifacts because some code somewhere is using a hard threshold of 0.0 to mask some other operation, and vice-versa for 1.0 with alpha, where suddenly because the 255 values are no longer 1.0f, you have very slightly see-through objects (often only visible in certain situations or when pixel-peeping) after pre-multiplication.
(Same thing can happen when 254 becomes 1.0f after +0.5 with masking).
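A small sketch of the failure mode described above. The two conversion helpers and the hard-threshold checks are illustrative stand-ins for "some code somewhere", not taken from any particular package:

```python
def to_float_255(v):
    """Endpoint convention: 0 -> 0.0 and 255 -> 1.0 exactly."""
    return v / 255.0


def to_float_centered(v):
    """Bin-centre convention: (v + 0.5) / 256."""
    return (v + 0.5) / 256.0


def is_fully_opaque(alpha):
    """Hard threshold of the kind often found in compositing code."""
    return alpha >= 1.0


def is_fully_masked(mask):
    """Hard threshold at zero."""
    return mask <= 0.0


print(is_fully_opaque(to_float_255(255)), is_fully_opaque(to_float_centered(255)))
print(is_fully_masked(to_float_255(0)), is_fully_masked(to_float_centered(0)))
```

With the centred convention, a byte of 255 decodes to just under 1.0 and a byte of 0 to just over 0.0, so both threshold checks stop firing.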
yuriks 4 hours ago [-]
I think more to the point, if 0 doesn't represent 0.0, and 255 doesn't represent 1.0, congratulations you've just lost your additive and multiplicative identities and most of the math used in colors falls apart.
The argument for 0-256 feels compelling when thinking about the physical display, but it seems like a very poor fit for any digital image processing or rendering.
herf 12 hours ago [-]
good point - alpha is a notable exception, it is not luminance
pornel 10 hours ago [-]
Although the post focuses on RGB, the same quantization issue exists for any type of signal being mapped between discrete and continuous representations.
The issue isn't in having a representation for 0 photons, but about maximizing information stored in a byte. Ideally you shouldn't be underutilizing the byte value 0, nor add bias to data that should have been assigned to the 0th bucket, regardless of what it represents (you could have a color space that goes from bright to super bright, and still want to ensure that every byte represents equal chunk of your brightness range).
PaulDavisThe1st 9 hours ago [-]
Yep, the exact same problem arises in digital audio, mapping between integer sample formats and the floating point representation that is generally used internally.
yxhuvud 14 hours ago [-]
Both solutions add 0.5, the difference is where in the process it happens.
amavect 13 hours ago [-]
I agree. Additionally, both 0.0 and 1.0 don't really exist for dithered signals, so a byte should map to [0.5, 255.5] before division by 256. This also solves the signed integer asymmetry, as a signed byte maps to [-127.5, 127.5] before division by 128. I wonder if audio DSP folks have done this already.
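A sketch of the mapping proposed in this comment, with hypothetical helper names. It shows that adding 0.5 before dividing makes the signed range symmetric around zero:

```python
def unsigned_byte_to_unit(b):
    """Map b in 0..255 to the centre of its bin, giving values in (0, 1)."""
    return (b + 0.5) / 256.0


def signed_byte_to_unit(s):
    """Map s in -128..127 to the centre of its bin, giving values in (-1, 1)."""
    return (s + 0.5) / 128.0


print(unsigned_byte_to_unit(0), unsigned_byte_to_unit(255))
print(signed_byte_to_unit(-128), signed_byte_to_unit(127))
```

The most negative and most positive signed codes map to values of equal magnitude, so the usual -128 vs +127 asymmetry disappears, at the cost of having no code that maps to exactly 0.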
amavect 11 hours ago [-]
Thinking about this more, dithering requires negative values to cancel out when adding. Works for audio, but color doesn't have negative numbers.
somat 10 hours ago [-]
It is still frequency, where it would have negative values. but I doubt any color handling algorithms deal with it as a frequency. Rightfully so, the physical wetware for decoding images is very different than that for decoding audio. Well... not that different if you think of audio as a single pixel monochrome image.
Now I am imagining a weird alternate history where we treat audio like we treat color. OK take three bytes which encode how loud the sound is, one for lows, one for mids and one for highs where lows mids and high frequencies are picked to match human ear response.
dylan604 11 hours ago [-]
> broadcast systems historically used 16-235
For 8-bit, 16 maps to 7.5IRE which is the well understood legal black. Mapping 235 means they mapped peak to 110IRE. This is based on a 0-120IRE scale. This gets weird as the broadcast limit for video was 100IRE allowing for the chroma to reach 110IRE. So if you're trying to limit your white values to 235, that'll be higher than is broadcast safe. Of course, nobody cares about NTSC broadcast limits any more. However, to this day, I still see out of spec tapes marked as "broadcast master" that have been ingested for streaming use. It drives me crazy to this day, and it's only getting worse as people don't even have scopes to adjust the VTR's TBC properly.
keithwinstein 9 hours ago [-]
> For 8-bit, 16 maps to 7.5IRE which is the well understood legal black. Mapping 235 means they mapped peak to 110IRE.
The "16" digital black level is independent of the "7.5 IRE" analog setup. E.g. in Japan with an 8-bit "NTSC-J" Rec. 601 system, my understanding is that 16 still maps to E'Y = 0 which is now at 0 IRE, and 235 is still E'Y = 1 at 100 IRE.
variaga 10 hours ago [-]
Ugh. Sudden flashbacks to having to switch analog output between Japanese NTSC (no pedestal) and US NTSC (with pedestal) without getting weird noise in the black regions.
But IIRC the MPEG-2 standard had luma==235 -> 100IRE for all of the analog formats (pal/ntsc-j/ntsc/secam) so I'm not sure why you say that would violate the broadcast limits?
dylan604 5 hours ago [-]
Simply because the math works out: with a simple linear scaling equation, 7.5IRE on a 120IRE scale maps to 16 in 8-bit, and 110IRE maps to 235 in 8-bit. To get 235 8-bit to match 100IRE means some sort of exponential scaling. At that point, I stuck with the linear scale and moved on with the keep-it-simple-stupid mindset.
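A quick check of that arithmetic, assuming the "simple scaling equation" is code = IRE / 120 × 256. The comment does not spell out the factor, so 256 is a guess that happens to reproduce both numbers, and `ire_to_code` is a made-up name:

```python
def ire_to_code(ire, ire_max=120.0, scale=256):
    """Linear mapping from an IRE level to an 8-bit code (assumed factor of 256)."""
    return round(ire / ire_max * scale)


print(ire_to_code(7.5), ire_to_code(110))
```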
infinet 12 hours ago [-]
Interesting idea, but somehow I feel the world is shaking. For the processing program, what used to be black (0.0) and white (1.0) has become very dark gray and very bright gray.
kazinator 10 hours ago [-]
They are not half sized at the edges, unless negative black bothers you.
themafia 13 hours ago [-]
> In a sense our eyes experience contrast on a sliding scale
There's a whole visual center to check the amount of incoming light and adjust your pupils for you. It's intentionally reactive.
> and there is no absolute zero in the system.
There maybe is. I think we call that "blind."
> broadcast systems historically used 16-235 as their luminance range for SDR
Mostly because it was a fully analog system and these all translate down to signal voltage. Jokingly NTSC used to be referred to as "Never Twice the Same Color" due to being a compromise bolted onto the side of an already compromised system.
a_conservative 11 hours ago [-]
>> and there is no absolute zero in the system.
> There maybe is. I think we call that "blind."
If you go looking into that, you'll see that the reality is far far more complex [0]
"The number of people with no light perception is unknown, but it is estimated to be less than 10 percent of totally blind individuals."
"While in theory there are cases where you might want to use either type of quantization, if you are in games don't do that!
The reason is that the GPU standard for UNORM colors has chosen "centered" quantization, so you should do that too."
Nuthen 13 hours ago [-]
That was a fun article to read of something I haven't had to think about in a while. It brought to mind moments in game development of having pixel art needing to be drawn on an integer value despite the game logic using floating point math. I tried something similar to the +0.5 in places so that it wouldn't look as bad (especially when there's a moving camera, which also needed to be truncated..).
I also enjoyed the 2002 article by Jonathan Blow [1] that's linked at the bottom. The visualization from the first article helped a lot once this started to go more in-depth.
Dammit, I have an 80%-written article covering the same issue but for ADCs, and had to put it aside for the past few months. There's historical precedent here from the 1960s and 1970s, and in large part it involves testing and definitions of gain and offset error in ADCs.
Someday I'll finish... :-(
dudu24 14 hours ago [-]
If you have a ruler and it goes to 12 inches, you should normalize by the length L and not by 13, the number of points on the ruler.
Timwi 12 hours ago [-]
I'm confused by that analogy. Is the “ruler” a 255-inch ruler with 256 points labeled 0–255, or is it a 256-inch ruler with 256 1-inch segments, making L = 256×1?
zephen 4 hours ago [-]
The analogy is pretty straightforward.
When you have a 12 inch ruler, you effectively have 13 numbers on the ruler. The fact that zero isn't marked is neither here nor there -- the numeral one is not at the far end of the ruler.
So if you extend the ruler to be as long as you can hold in eight bits, it will range from 0 to 255, and the total length will be 255.
The ruler analogy may seem overly simplistic, but then the real world is likewise fairly simplistic.
At the end of the day, the numbers presumably come from a sensor, or go to a display, and, often, in either case, zero represents as dark as you can get and 255 represents as light as you can get, so the physics dictate that the intervals associated with the 0 and 255 are half the size of the rest of the intervals.
Audio is more interesting than video, because in audio, you care deeply about not having an offset, and about having a balanced signal, so the question of whether the midpoint is actually on a number or not is pertinent.
In audio, it is often useful to simply discard a code so that 0 is the midpoint (e.g. in 16-bit, -32767 to +32767, discarding 0x8000). But this still gives you smaller intervals at both ends.
lacedeconstruct 14 hours ago [-]
yes but >> 8 is so much faster
xigoi 13 hours ago [-]
You don’t divide a float by 256 by shifting it right eight bits; that would yield complete garbage. You subtract 8 from the exponent, then check if you got an underflow.
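In Python the exponent adjustment described here is available as `math.ldexp`, which also takes care of underflow. A minimal sketch (the wrapper name is illustrative):

```python
import math


def div_by_256(x):
    """Divide by 2^8 by lowering the binary exponent; the mantissa is untouched."""
    return math.ldexp(x, -8)


# frexp splits a float into (mantissa, exponent) so the effect is visible.
mantissa_before, exp_before = math.frexp(200.0)
mantissa_after, exp_after = math.frexp(div_by_256(200.0))
print(mantissa_before, exp_before, mantissa_after, exp_after)
```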
dheera 12 hours ago [-]
Same point; divide by power of 2 is a fast subtraction operation in float world, while divide by 255 shits all over the whole float
yongjik 9 hours ago [-]
If your input is an arbitrary float, you need to check for denormals (and maybe NaNs). You can do a bitmasking trick to avoid conditional jumps, but I'm skeptical you can do it faster than a SIMD multiply instruction.
StilesCrisis 13 hours ago [-]
It's just multiplication. Floating multiply is extraordinarily fast.
lacedeconstruct 13 hours ago [-]
The difference between 20 cycles and 1 clock cycle in a hot loop is very noticeable
exyi 12 hours ago [-]
It's 3 cycles for float multiplication (and 1 for shift right):
In throughput it's even less of a difference: 2 per cycle vs 3 per cycle.
userbinator 6 hours ago [-]
> It's 3 cycles for float multiplication (and 1 for shift right):
3x faster
> In throughput it's even less of a difference: 2 per cycle vs 3 per cycle.
50% faster
Tuna-Fish 12 hours ago [-]
FP Division by constant is optimized by a compiler into a multiply. Graphics processing typically happens on the GPU these days, and on all recent GPUs FPMUL belongs to the class of lowest-latency operations. That is, there are no other instructions that complete faster.
pixelesque 12 hours ago [-]
Only with things like -ffast-math enabled will compilers do the reciprocal.
It can make a fair difference in some cases, but it's often better to selectively use it in code locations you know are acceptable by doing it manually in the code.
mgaunard 12 hours ago [-]
That's only valid to do if the reciprocal is representable exactly.
hansvm 11 hours ago [-]
That's not totally true. It's sufficient to be exactly representable, but you only need the reciprocal rounding error to be small enough to guarantee the multiplication rounding step fixes it across the entire range of numerators. For IEEE754 f16 values, there are 28 such extra values, the positive and negative sides of 1705/x where x is a power of 2 at least as great as 2048.
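A sketch of how one might test whether multiplying by a reciprocal reproduces division bit for bit. This uses Python floats (IEEE 754 doubles) and only the byte values 0..255 as numerators, so it illustrates the idea rather than the f16 result quoted above; the helper name is hypothetical:

```python
import math


def mismatches(divisor, numerators):
    """Numerators for which n * (1/divisor) differs from n / divisor."""
    reciprocal = 1.0 / divisor
    return [n for n in numerators if n / divisor != n * reciprocal]


codes = [float(v) for v in range(256)]

# 1/256 is exactly representable, so the two forms always agree.
mismatch_256 = mismatches(256.0, codes)

# 1/255 is not exactly representable; any disagreement is at most a few ulps.
mismatch_255 = mismatches(255.0, codes)
print(len(mismatch_256), len(mismatch_255))
```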
Sesse__ 13 hours ago [-]
Useful, then, that you can start several vectorized floating-point muls each cycle. (E.g., most modern x86 are 3/0.5 cycles for vmulps. No 20 cycles in sight.)
dist-epoch 14 hours ago [-]
Only in micro-benchmarks.
For real usage, today's CPUs are limited by memory bandwidth.
lacedeconstruct 13 hours ago [-]
What are you talking about in a hot loop in my software renderer this is like 10x faster
And both are wrong since the values would have to be in a linear color space for the compositing math to make sense. But in some non-linear space to be useful when mapped to 0..255 (e.g. non-linear sRGB).
Which happens right after the Porter-Duff Over operator above -- a smoking gun. Which one is it gonna be?
I.e. the display transform is omitted from this and the math involved with the latter makes your whole argument moot.
It can't be expressed well enough with bitshifts to keep your purported 10x speedup anyway (and which I strongly doubt btw).
And lastly: in a software renderer that stuff is usually <0.01% of the compute in the absolute worst case.
P.S.: I'm speaking from 30 years of experience with software rendering in the context of VFX.
Tuna-Fish 12 hours ago [-]
If the latter is 10x faster, the issue is some kind of weird compilation failure for the above version. For one, it only cuts a third of the multiplies.
dist-epoch 13 hours ago [-]
Because you are working in the cache.
Also, you should use SIMD.
lacedeconstruct 13 hours ago [-]
> Also, you should use SIMD.
ironically no clang is better at auto vectorizing
layer8 11 hours ago [-]
But who says that the numbers are representing the points, rather than representing the intervals between the points?
wky 11 hours ago [-]
It doesn't even need to represent intervals. A 13 inch ruler with 13 markings at 0.5, 1.5, etc inches is still a valid ruler, albeit an odd construction.
m463 8 hours ago [-]
the correct way is to use a slide rule
groundzeros2015 14 hours ago [-]
I’m dumb. Doesn’t 0 start at the beginning?
dylan604 11 hours ago [-]
It's right up there with the confusion if 2000 was the new year of the 21st century or the last year of the 19th century.
simonask 10 hours ago [-]
For the record, the mathematically correct answer to this question is that the year 2000 was the last year of the 20th century.
The reason is that year 0 never existed. The year 1 BCE was followed by the year 1 CE.
Culturally, anthropologically, and psychologically it might be a different matter. But 2000 years had not passed before the end of that year.
tshaddox 2 hours ago [-]
What makes this argument less compelling is that “year 1 AD” also didn’t exist at the time, and this isn’t a great reason to abandon the arithmetically sane approach of zero-indexed year numbering.
The calendar was back-dated 500 or so years after Jesus, by a European guy before Europe had the concept of zero, leaving us with 1-indexed years. Then, 200 or so years after that, another guy (still lacking the concept of zero) made the even less venerable decision that the year right before 1 AD would be 1 BC.
We could just decide today that 0 came right before 1 AD and was the first year of the first century AD. Then we’d just have to shift all BC dates by 1 year in all our history books.
The upside would be that arithmetic on year labels starts working again. The downside is that there are way too many history books and no one will ever do this.
Of course, the easier way out is to just decide today that either 1) the first century began in 1 BC or 2) the first century had 1 fewer year than all the other centuries.
tzot 10 hours ago [-]
The debate is if 2000 is the first year of the 21st century or the last year of the 20th century. (btw I agree with the latter)
dylan604 8 hours ago [-]
wow, yeah, that's quite the miss on my part.
dpark 9 hours ago [-]
The entire issue arises from the use of truncation, right? It guarantees that only an exact 1.0 could land in the 255 bin so the net effect is a reduction of 256 bins to 255 bins. (Using random numbers as shown also guarantees no 1.0.)
Why not scale to fill the available bins, though? i.e. trunc(result * 255.999)?
klodolph 9 hours ago [-]
> Why not scale to fill the available bins, though? i.e. trunc(result * 255.999)?
That’s half of the mid-riser staircase quantizer discussed in the article. (The other half is coming up with the reverse.)
(I would implement it as min(floor(x * 256), 255).)
dpark 8 hours ago [-]
Kind of. According to the article, the mid-riser vs mid-tread distinction correlates to where the 0.5 is applied in the transform. I’m proposing that there is no 0.5 applied at all. Instead of counteracting the compression of the truncation operation by adding a fixed offset, it multiplies by a scale factor.
Possibly my proposal doesn’t hold up to repeated transforms and operations. It might skew toward 255 in real operations.
klodolph 6 hours ago [-]
The 0.5 comes from debiasing the round-trip.
If your conversion from high precision -> 8-bit is just multiplication by 256 and then truncation, then you’ve got the mid-riser quantizer. The +0.5 comes from interpreting a value of 0 as bucket from 0-1, just like the value of 255 is the bucket from 255-256. It’s introduced in the conversion back from 8-bit to high precision.
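The round trip described here can be sketched as follows, using the `min(floor(x * 256), 255)` form from earlier in the thread for the forward direction. Function names are illustrative:

```python
import math


def quantize_mid_riser(x):
    """High precision -> 8-bit: multiply by 256, truncate, clamp x = 1.0 into 255."""
    return min(math.floor(x * 256), 255)


def dequantize_mid_riser(v):
    """8-bit -> high precision: treat code v as the bucket [v, v+1) and take its centre."""
    return (v + 0.5) / 256


# Every code survives a decode/encode round trip unchanged.
round_trip_ok = all(
    quantize_mid_riser(dequantize_mid_riser(v)) == v for v in range(256)
)
print(round_trip_ok, quantize_mid_riser(0.0), quantize_mid_riser(1.0))
```

Without the +0.5 in the decode step, each code would decode to the lower edge of its bucket, which biases every reconstructed value downward by half a step.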
Sesse__ 13 hours ago [-]
You should multiply by 255.0, optionally add a dither (triangular is okay), and then let the FPU round using its default IEEE 754 round-to-nearest, ties-to-even mode. None of this crazy 0.5 stuff. :-)
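A sketch of that recipe. It assumes the triangular dither spans ±1 code (built as the difference of two uniform draws) and relies on Python's built-in `round`, which rounds ties to even; the function name and the seed are arbitrary:

```python
import random


def quantize_with_tpdf(x, rng):
    """Scale by 255, add triangular dither in (-1, 1), round to nearest (ties to even)."""
    dither = rng.random() - rng.random()
    code = round(x * 255.0 + dither)
    return min(255, max(0, code))  # clamp in case dither pushes past the ends


rng = random.Random(0)
samples = [quantize_with_tpdf(0.5, rng) for _ in range(20000)]
mean_code = sum(samples) / len(samples)
print(sorted(set(samples)), mean_code)
```

An input of 0.5 scales to 127.5, which no single code represents, but the dithered codes average out close to it.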
MyMemoryfails 6 hours ago [-]
As a game dev, I never understood why floats are used to represent colors. Aren't integers better? The issues this article mentions wouldn't exist.
I can only think it's due to integers having undefined behavior on overflow; usually it's wrapping, but not always.
somat 5 hours ago [-]
It is because light(colors) is a fundamentally exponential process and floats are also a fundamentally exponential process and as such the two are a good match for one another.
sheept 5 hours ago [-]
Floats are better for calculating lighting. I would think some GPUs are probably also more optimized for float processing than ints.
orlp 2 hours ago [-]
When going from float to u8 you should add a triangular dither. It makes a world of difference for grayscale gradients, even in 24bit truecolor.
kazinator 10 hours ago [-]
No, the "alternative" approach looks strange in the 3-bit example.
1.0 lies on the right side of the bin 7. But 0.0 lies on the left of bin 0.
The standard approach assumes that we have centered samples: that zero is dead black, plus (and minus!) some uncertainty, and bin 7 is centered in the same way.
If the sampling of the intensity is distortion-free (no clipping took place due to overexposure) then bin 7 represents a range of possible values centered around 1.0.
It is not a half-sized interval.
> This means that when converting floating-point values in the [0,1] range back to integers, the extreme bins have effectively half the width of other bins.
Under any interpretation whatsoever of the image samples, there is latitude for interpreting the maximum value 255 as being distortion: clipping from an arbitrarily higher value. Shifting things by 0.5 doesn't fix this issue of not knowing whether 255 means that an intensity close to 1.0 is being represented (no distortion), or an outlier intensity of 37.49 (severely clamped). That could go the other way too.
In other words, there is a possible bias in the extreme bin. The signal could be limited such that the bin's full sampling range is not in effect, or the signal could be overwhelming, so that values far outside of the range are clipped and included.
The only way around this is to make the highest value a canary which represents "clipped value". That is to say, 255 means "clipped datum", so that only 254 and below is sampling of unclipped signal. Machine-generated images (e.g. 3D renderings) then avoid the 255 value, and camera sensors are calibrated so that it doesn't occur when technical images are being shot.
Retr0id 14 hours ago [-]
Both of these assume a linear transfer function, which is rarely the case.
leni536 13 hours ago [-]
Basically never for 8-bit color channels.
jessetemp 12 hours ago [-]
The author is confusing bins with bin edges. In their first plot, the standard approach looks strange because 0-7 should be the bin edges, not the center points as shown in the plot.
You can see this confusion again in the histogram example. There are only 255 bins, not 256. If you fix that mistake and remove the 0.5 offset, then the histogram is distributed correctly at both ends.
pornel 11 hours ago [-]
No, the author understands the problem way deeper than you do.
You haven't grasped the fact that the choice isn't obvious, and has subtle trade-offs.
If you don't believe the author, check the other posts he references.
jessetemp 5 hours ago [-]
Judging by your other comment in this thread, you might agree with my rationale [1] more than you realize
2^8 = 256. You can represent 256 distinct values, bins, with an 8 bit number. If you stick a 0 in that first one, it takes a bin. If you fill the rest with by-one increasing integers, then the max value will be 255, thus the 2^bits - 1, which is the max value you can store.
bjourne 11 hours ago [-]
How do you fit 256 distinct values into 255 bins?
jessetemp 11 hours ago [-]
By counting the edges
jrmg 8 hours ago [-]
I see what you’re saying - index 0 holds values from 0-1, index 1 from 1-2 etc, but then you have index 255 holding values between 255 and 256. So you’re sort of arguing that the 0-255 8-bit quantization is actually representing ‘real’ values of 0-256?…
Edit: somehow missed alterom’s reply - they explain it much better than my question above does.
alterom 9 hours ago [-]
Sorry, you seem to be confused.
>There are only 255 bins, not 256
There are 256 bins because there are 256 values.
The questions are:
1. What are the boundaries of these bins?
2. Which sample represents a particular bin?
With 1-bit color, we have sample values {0, 1}. What bins do they represent?
Here's one choice:
[0, 1), [1, 2)
Two equally sized bins, spanning the interval [0, 2] of length 2, each defined by its sample at lower bound.
Alternatively, we could consider these bins:
[-0.5, 0.5), [0.5, 1.5)
These are also equally sized bins, spanning the interval [-0.5, 1.5] of length 2, defined by samples at the center.
We could also define bins like this:
[0, 0.5), [0.5, 1]
Two equally sized bins spanning the interval [0, 1] of length 1, where we sample the first bin at the lower bound, and the last bin at the upper bound.
This, in a nutshell is what the author is trying to explain.
Let's look at this again, with 2 bits.
With 2-bit color, we have sample values {0, 1, 2, 3}.
The first two choices span an interval of length 4; the third spans an interval of length 3.
In the third case, the tail bins are short (have size ½), and the rest have size 1.
The last bin must be a closed interval in the third case, so that it includes the value we picked to represent it.
None of these choices is inherently invalid or better than the others; and none stems from "confusing bins with edges".
The third option does have the distinction that the first and last bins are smaller than the rest. But it's not necessarily a drawback. Especially when we're talking about color, hardware interpretation, and human perception.
When you remap these bins into the [0, 1] interval, you're "dividing by 4" in the first two cases, and by 3 in the third case.
The maps are:
x → x/4
x → (x + ½)/4
x → x/3
The inverse maps (that yield a sample in {0, 1, 2, 3} given a floating point value in interval [0, 1]) are:
x → trunc(4x)
x → round(4x - ½) = trunc(4x)
x → trunc(3x + ½)
In the first two options, the domain is [0, 1). It might be necessary to apply clipping because the exact value 1.0 falls outside the range of the forward transform.
The 2nd option is the most symmetric, of course, but the 3rd one is the most straightforward (and cheapest) to implement, so that's the default.
The choice amounts to making the highest and lowest bins slightly smaller to make the rest slightly larger.
That's to say, if you generate uniform noise between 0 and 1, you'll get the following samples from your function with equal probability:
0 or 3
1
2
As the author points out, this hardly matters when you are talking about having 256 bins.
That, and with color specifically, the "good" histograms aren't uniform anyway (and any photographer wants to avoid getting much at either extreme).
TL;DR: The author is not confusing anything — but their diagram and explanation are, indeed, a bit confusing.
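For anyone who wants to poke at this, here is a minimal Python sketch of the three options for 2-bit samples, using the maps listed above. The function names are made up for illustration, and the "uniform noise" is approximated by an evenly spaced grid so the counts are reproducible.

```python
import math
from collections import Counter

LEVELS = 4  # 2-bit samples: {0, 1, 2, 3}

# Forward maps: sample -> value in [0, 1]
def decode_lower(s):   # option 1: sample sits at the lower bound of its bin
    return s / LEVELS

def decode_center(s):  # option 2: sample sits at the center of its bin
    return (s + 0.5) / LEVELS

def decode_ends(s):    # option 3: first and last samples sit on 0 and 1
    return s / (LEVELS - 1)

# Inverse maps: value in [0, 1] -> sample
def encode_floor(x):   # inverse for options 1 and 2, clipped so 1.0 gives 3
    return min(math.trunc(LEVELS * x), LEVELS - 1)

def encode_ends(x):    # inverse for option 3
    return math.trunc((LEVELS - 1) * x + 0.5)

def histogram(encode, steps=6000):
    # Evenly spaced stand-in for uniform noise on [0, 1)
    return Counter(encode((i + 0.5) / steps) for i in range(steps))

print(sorted(histogram(encode_floor).items()))  # equal counts per sample
print(sorted(histogram(encode_ends).items()))   # 0 and 3 get half as many
```

With option 3 the counts for 0 and 3 come out at half of those for 1 and 2, which is the "0 or 3 / 1 / 2" split above.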
jessetemp 5 hours ago [-]
Thank you for the thoughtful reply. Maybe bins is the wrong word to use, so I'll try with intervals. Starting with 1 bit data, there are two numbers and one interval. I think where bins makes it confusing is that inside the interval there are two big rounding errors mapping everything to either 0 or 1 and many people seem to be considering those the bins.
Taking a step back, remember we're ultimately mapping these discrete numbers to some real world continuous variable like the saturation of red, frequency, mass on a scale, whatever. And our digital device can only represent a finite amount of numbers. For 2 bit data, we can represent 0-3, and for 3 bit data we can represent 0-7.
The important part is that 0 represents the minimum and 1, 3, and 7 all represent the same maximum real value, and everything that can be measured by the device will fall within those ranges. So comparing 1, 2 and 3 bit data on a linear number line looks like this:
0                    1
0      1      2      3
0  1  2  3  4  5  6  7
You could assume that everything gets assigned to whatever number is nearest in the number scale or come up with another scheme, but that is ultimately defined by the ADC and likely nonlinear. All we know is that those are the numbers we have available to represent the real values we're measuring.
The question is about how to normalize the data. 1 bit data is already normalized. If you normalize 2 bit data by 3 you get [0, 1/3, 2/3, 1]. LGTM. If you normalize it by 4, you get [0, 1/4, 2/4, 3/4] and you're effectively throwing away some of the range of the ADC. You can try to get it back by offsetting by 0.5 then normalizing but now you get [1/8, 3/8, 5/8, 7/8]. And you could stretch that with some clever formula to fill from 0 to 1, but if you do it right then it's the equivalent to normalizing by 3, so why not normalize by 3?
So the answer is, if you have N bit data, you normalize by 2^N-1.
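A small Python sketch of that arithmetic, in case it helps. It uses exact fractions so the values match the lists above; the function names are just for illustration.

```python
from fractions import Fraction

def normalized_levels(bits):
    """Codes 0 .. 2**bits - 1 divided by 2**bits - 1, so the ends hit 0 and 1."""
    top = 2**bits - 1
    return [Fraction(code, top) for code in range(top + 1)]

def centered_levels(bits):
    """The 'offset by 0.5, then divide by 2**bits' alternative."""
    n = 2**bits
    return [Fraction(2 * code + 1, 2 * n) for code in range(n)]

def stretched(levels):
    """Linearly stretch a list of levels so it fills 0 to 1."""
    lo, hi = levels[0], levels[-1]
    return [(v - lo) / (hi - lo) for v in levels]

print(normalized_levels(2))           # 0, 1/3, 2/3, 1
print(centered_levels(2))             # 1/8, 3/8, 5/8, 7/8
print(stretched(centered_levels(2)))  # same as normalized_levels(2)
```

Stretching the offset version back out to fill 0 to 1 lands on exactly the divide-by-3 values.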
Zardoz84 31 minutes ago [-]
0-255
I find using 1-256 weird.
crazygringo 14 hours ago [-]
Advice for anyone on mobile: read in landscape mode if you want to be able to see the division by 256 version code example at the start.
The HTML/CSS is broken: it lets the code example overflow the right edge of the page instead of wrapping.
I re-read this post three times in total confusion before I figured out the most important piece was off-screen entirely.
2001zhaozhao 8 hours ago [-]
There are only two real solutions after factoring in the need to preserve black as zero.
They are "rgb / 255.0" vs. "rgb / 256.0". Both have different tradeoffs. Pick your poison. (If you're using an 8-bit display signal then you had better match whatever value the OS picked for the mapping back to the display, so your RGB values pass through unchanged.)
JamesTRexx 2 hours ago [-]
Why not (uint8_t) ( float * (256/255) ) * 255?
theyeenzbeanz 14 hours ago [-]
Should always be 0-255 as that fits an unsigned byte.
crazygringo 14 hours ago [-]
That's not what the article is about.
Retr0id 14 hours ago [-]
> assume that in both cases the output values are clamped before the final typecast
atilimcetin 13 hours ago [-]
Interesting article. I tend to use
- i = min(floor(f * 256), 255) (from float to uint8)
- f = i / 255 (from uint8 to float)
Basically a mix of the 2 approaches mentioned in the article.
For all integers between [0,255], if I do uint8 -> float -> uint8 conversion, I will get the same result.
--
edit: I wondered what's the maximum jitter amount that I can introduce to the float and get the same uint8 value. And also these 0->0.0 and 255->1.0 should map properly.
With my approach at the top, the jitter margin that I can introduce is 1/65280.
But with the article's approach
- i = floor(f * 255 + 0.5)
- f = i / 255
maximum jitter margin is 1/510 (which is better).
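Here is a rough Python check of both claims, a sketch rather than a proof. The margin is taken as the smallest gap between any decoded value i/255 and any interior bin edge of the encoder, computed with exact fractions; the helper names are mine.

```python
import math
from fractions import Fraction

def encode_mixed(f):   # floor(f * 256), clamped to 255
    return min(math.floor(f * 256), 255)

def encode_round(f):   # floor(f * 255 + 0.5), the article's version
    return math.floor(f * 255 + 0.5)

def decode(i):         # i / 255, shared by both schemes
    return i / 255

# Interior bin edges of each encoder, as exact fractions
edges_mixed = [Fraction(k, 256) for k in range(1, 256)]
edges_round = [Fraction(2 * k - 1, 510) for k in range(1, 256)]

def jitter_margin(edges):
    # Smallest gap between a decoded value i/255 and any interior edge
    return min(abs(Fraction(i, 255) - e) for i in range(256) for e in edges)

print(jitter_margin(edges_mixed))  # 1/65280
print(jitter_margin(edges_round))  # 1/510
```

Both variants round-trip every integer in [0, 255], but the mixed one leaves far less room for error near the ends of the range.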
AgentME 12 hours ago [-]
It's worth pointing out that the article explicitly calls out your first mixed technique:
> Finally, one should never mix the encode and decode steps of the two quantizers. That’s just broken code. It’s an easy mistake to make, though.
vitorsr 13 hours ago [-]
This is what I do for the former:
floor( nextafter( 256, 255 ) * value )
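A quick Python sketch of the same idea, assuming double precision and Python 3.9+ for math.nextafter. One thing it shows: inputs that sit exactly on a bin edge, such as 0.5, land one bin lower than they would with min(floor(256 * value), 255).

```python
import math

# Largest double strictly below 256, so SCALE * 1.0 still floors to 255
SCALE = math.nextafter(256.0, 255.0)

def encode(value):
    """Map a float in [0, 1] to 0..255 without an explicit clamp."""
    return math.floor(SCALE * value)

print(encode(0.0), encode(1.0))              # 0 255
print(encode(0.5), math.floor(256.0 * 0.5))  # 127 128
```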
atilimcetin 13 hours ago [-]
Oh very nice idea to get rid of the min operator.
RobRivera 10 hours ago [-]
Are we talking 0 or 1 based values? HONKHONK*
wyager 9 hours ago [-]
You don't need to make this judgement; it's fixed by the colorspace you're working in.
First, figure out what colorspace the processing needs to happen in. Usually this is linear RGB.
Then, figure out what OETF and EOTF your input/output format use. This will be something like PQ or HLG. This will exactly specify the meaning of each integer value.
This fixes the choice of representation and conversion.
AlienRobot 10 hours ago [-]
Case against 255: it looks wrong in the graph :(
Case against 256: no 0 or 1 values :(
Considering how important having a 0 and 1 value is for arithmetic in general, I think 255 is better.
dist-epoch 14 hours ago [-]
A similar issue exists in the audio world, for example 16-bit integer audio is between [-32768, 32767] (non-symmetric), but floating point audio is [-1.0, 1.0].
ack_complete 10 hours ago [-]
There is an analogous situation in graphics with signed normalized formats. The solution there is that the R16_SNORM format maps -1 to +1 as [-32767, 32767] with -32768 being a special value (not normally emitted, and mostly but not always interpreted as -32767). Some audio storage formats seem to use this mapping too.
adzm 13 hours ago [-]
note that floating point audio very often exceeds [-1.0, 1.0] within the pipeline, just to be tamed at the very end of the mix to fit within those bounds. this is pretty much why every modern DAW uses floating point these days.
ctdinjeu8 13 hours ago [-]
Both. 255 for each color and the last 1 as the alpha for each channel.
Why not??? Fight me
DigitallyFidget 14 hours ago [-]
255 gives 0-255, which gives you a zero value. 256 is 1-256, you lose the option of setting 0.
crazygringo 14 hours ago [-]
That's not what the article is about.
dgently7 13 hours ago [-]
"Let’s say you’re writing an image processing program. The program takes in an image, converts it to floating point, does some processing and finally saves the modified pixels to disk as 8-bit colors. "
Excuse to argue about the best way aside, if this is the goal you should not be rolling your own image file reading. You should use OpenImageIO. Idk what approach it takes in its internal conversion to float, but that library is more likely to have the right answer than you trying to roll it yourself, given it's the library used internally by tons of professional image manipulation software...
pixelesque 13 hours ago [-]
If you're a beginner, or just want something which works quickly, sure.
However OIIO is far from perfect in all situations (having had to debug and fix issues with its mip-map generation filtering code in the past), so don't always assume that just because there's a mature open source library out there doing something that it's always perfect.
dgently7 12 hours ago [-]
sure of course nothing is perfect and oiio has a lot of surface area / is still oss. thats good advice.
ive just seen a lot of "ai researchers" who are getting into professional image processing and are both beginners and want things quickly and so could do much worse than just starting from what they get out of oiio. especially for a lot of the non-obvious stuff (more of that in color handling than just the io stuff though)
And when you go from float to 8bit you should dither to avoid banding.
If in doubt, error diffusion with a random number between -0.5..=0.5 is fine. 0.5 here is dither_amplitude:
round(255 * input_value + dither_amplitude * random(-1, 1))
See e.g. my dithereens crate: https://crates.io/crates/dithereens
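A minimal Python sketch of that formula, not taken from the crate. The function name and the clamp are my additions, and it rounds with floor(x + 0.5) because Python's built-in round() rounds halves to even.

```python
import math
import random

def quantize_dithered(value, amplitude=0.5, rng=random):
    """Quantize a float in [0, 1] to 0..255 with uniform random dither."""
    noise = amplitude * rng.uniform(-1.0, 1.0)
    code = math.floor(255.0 * value + noise + 0.5)  # round half up
    return max(0, min(255, code))  # the noise can push past either end

rng = random.Random(0)
samples = [quantize_dithered(0.5, rng=rng) for _ in range(20000)]
print(sum(samples) / len(samples))  # close to 127.5, a mix of 127 and 128
```

The point of the noise is that the average of many dithered samples tracks the input (127.5 here) instead of snapping to one code.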
It becomes a pain in the ass when you're generating a VGA signal with a microcontroller with 8 color output pins (3 red, 3 green, 2 blue). The meaning of a color value is very real in this setup: it corresponds to a voltage level you must send to the VGA monitor, 0V-0.7V.
So the blue channel will map (0->0V, 1->0.23V, 2->0.47V, 3->0.7V), and the red/green will map (0->0V, 1->0.1V, ..., 7->0.7V). Notice how none of the blue voltages match any of the red/green ones (other than the extremes)? That means you don't get to see any pure grays -- the closest ones will have bit of blue or yellow tint, depending on the direction of the difference.
Not only that, any gradients at all (other than the ones not mixing blue with the other channels) will be noticeably off: for example, the closest colors on the line from pure red to pure white will all be slightly orange or purple.
Code for VGA output in 8-bit color with double-buffered 320x240 framebuffer for the Raspberry Pi Pico 2 here, if anyone cares: https://github.com/moefh/pico-vga-8bit-demo
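To make the mismatch concrete, here is a small Python sketch of the voltage levels above. It assumes an ideal linear DAC with 0.7 V full scale and is not taken from the linked repo; the names are illustrative.

```python
from fractions import Fraction

V_MAX = Fraction(7, 10)  # 0.7 V full scale, as described above

def channel_levels(bits):
    """Ideal output voltages for a channel with the given number of bits."""
    top = 2**bits - 1
    return [V_MAX * code / top for code in range(top + 1)]

blue = channel_levels(2)       # 0, 7/30, 7/15, 7/10 volts
red_green = channel_levels(3)  # 0, 1/10, 2/10, ..., 7/10 volts

# Voltages that both kinds of channel can produce exactly
shared = sorted(set(blue) & set(red_green))
print([float(v) for v in shared])  # only the two extremes

for b in blue[1:-1]:
    nearest = min(red_green, key=lambda v: abs(v - b))
    print(float(b), "V blue, nearest red/green level:", float(nearest), "V")
```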
OMG I remember as a kid staring at static-y CRT displays, and seeing these faint blue and yellow lines at the borders of them. I’d always wondered why they appeared and why they were specifically blue and yellow. I finally know! (at least, assuming those specific artifacts are due to the same thing)
2^2.2 = 4.595, 255^2.2 = 196,964.699
(This may be more apparent when you frame gamma as being applied in the 0-1 range, so it doesn’t really turn 2 into 4.595 and 255 into ~200k; it turns (2/255)≈0.00784 into (2/255)^2.2 ≈ 0.0000233, and leaves (255/255)=1 as is.)
Changing at 30Hz I doubt a human can tell the difference between slightly blue and slightly yellow.
I assume this is why RGBI color was so common in the 80s.
Coming from an electrical engineering background, I disagree with how the author presented "Two types of quantizers". Mathematically rigorous, but not grounded in practical systems.
In ADCs, there is always an inherent +-1/2 LSB of quantisation uncertainty. The transfer characteristic is always mid-tread sampling, or at least I haven't come across any counter examples. This is true for bipolar or unipolar ADCs.
The lowest code is negative voltage reference, and the highest code the positive reference. The transfer characteristic plot will show what the author has demonstrated, that the highest and lowest bins will effectively be 1/2LSB in width.
In a unipolar system, this has the consequence of not being able to represent the midpoint voltage precisely, or in other words, the gray problem. In a bipolar system, 0V will be mid-tread N/2 value, but that doesn't mean it has "256 ranges".
So, I'll be sticking with (VREF+ - VREF-) * k / (2^N - 1). Or in other words, I agree with the normalisation by 255. It's the fence post error all over again: you have N values, but N-1 ranges. If you have fewer ranges than you do values, you need to distribute one of those ranges between two values, hence the 1/2 LSB range endpoints.
But if you actually care about what the voltage (and uncertainty) is, most of this difference is pointless: your reference will have a bias, there are linearity errors, and so on. 1 LSB will not match either 1/128 or 2/255, so you will need parameters to compensate for it.
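As a rough Python sketch of the nominal transfer, ignoring all the real-world errors just mentioned: the reference voltages below are hypothetical defaults, and adding VREF- as an offset is my assumption for the unipolar case.

```python
def code_to_voltage(code, bits=8, vref_minus=0.0, vref_plus=3.3):
    """Nominal voltage for an ADC code, assuming an ideal mid-tread transfer."""
    top = 2**bits - 1  # number of ranges between the lowest and highest code
    return vref_minus + (vref_plus - vref_minus) * (code / top)

print(code_to_voltage(0), code_to_voltage(255))    # 0.0 3.3
print(code_to_voltage(127), code_to_voltage(128))  # both miss the 1.65 V midpoint
```

With an even number of codes there is no code sitting exactly on the midpoint, which is the gray problem in the unipolar case.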
This is the same kind of confusion that happens with sampling positions in modern APIs, where the location is specified in coordinates and not in pixel centers.
RGB values represent luminances against some adapted state, and a "zero" in a daylit scene is not "zero luminance" - it's just about 0.001x as bright as the brightest point - it's millions of photons, way more than zero. In a sense our eyes experience contrast on a sliding scale, and there is no absolute zero in the system. For example, broadcast systems historically used 16-235 as their luminance range for SDR. I think any argument that says "we must have zero" is going to have a bias, but I don't think zero is needed for most things.
Also, a lot of workflows for image processing and compositing do assume that 0 means zero, whether correctly or not (often incorrectly). So there are often assumptions that for 8-bit, 0u maps to 0.0f and 255 maps to 1.0f for things like masking or alpha: as soon as you have 0 values which become just over 0.0, you then have artifacts because some code somewhere is using a hard threshold of 0.0 to mask some other operation, and vice-versa for 1.0 with alpha, where suddenly because the 255 values are no longer 1.0f, you have very slightly see-through objects (often only visible in certain situations or when pixel-peeping) after pre-multiplication.
(Same thing can happen when 254 becomes 1.0f after +0.5 with masking).
The argument for 0-256 feels compelling when thinking about the physical display, but it seems like a very poor fit for any digital image processing or rendering.
The issue isn't in having a representation for 0 photons, but about maximizing information stored in a byte. Ideally you shouldn't be underutilizing the byte value 0, nor add bias to data that should have been assigned to the 0th bucket, regardless of what it represents (you could have a color space that goes from bright to super bright, and still want to ensure that every byte represents equal chunk of your brightness range).
Now I am imagining a weird alternate history where we treat audio like we treat color. OK take three bytes which encode how loud the sound is, one for lows, one for mids and one for highs where lows mids and high frequencies are picked to match human ear response.
For 8-bit, 16 maps to 7.5IRE which is the well understood legal black. Mapping 235 means they mapped peak to 110IRE. This is based on a 0-120IRE scale. This gets weird as the broadcast limit for video was 100IRE allowing for the chroma to reach 110IRE. So if you're trying to limit your white values to 235, that'll be higher than is broadcast safe. Of course, nobody cares about NTSC broadcast limits any more. However, to this day, I still see out of spec tapes marked as "broadcast master" that have been ingested for streaming use. It drives me crazy to this day, and it's only getting worse as people don't even have scopes to adjust the VTR's TBC properly.
Generally no -- in an 8-bit NTSC-M Rec. 601 system, 16 maps to E'Y = 0 at 7.5 IRE, and 235 maps to E'Y = 1 at 100 IRE. See https://www.poynton.ca/pdf/Poynton-1996-TechIntrDigiVide.pdf
The "16" digital black level is independent of the "7.5 IRE" analog setup. E.g. in Japan with an 8-bit "NTSC-J" Rec. 601 system, my understanding is that 16 still maps to E'Y = 0 which is now at 0 IRE, and 235 is still E'Y = 1 at 100 IRE.
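For reference, a minimal Python sketch of that digital mapping (16 at E'Y = 0, 235 at E'Y = 1, so 219 steps in between). It covers only the code-to-E'Y step and says nothing about IRE or analog setup; the function names are illustrative.

```python
import math

def luma_from_code(code):
    """8-bit studio-range luma code -> E'Y, with 16 -> 0.0 and 235 -> 1.0."""
    return (code - 16) / 219

def code_from_luma(e_y):
    """E'Y -> nearest 8-bit code, clamped to the 0..255 byte range."""
    return max(0, min(255, math.floor(16 + 219 * e_y + 0.5)))

print(luma_from_code(16), luma_from_code(235))  # 0.0 1.0
print(luma_from_code(255))                      # above 1.0: headroom
```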
But IIRC the MPEG-2 standard had luma==235 -> 100IRE for all of the analog formats (pal/ntsc-j/ntsc/secam) so I'm not sure why you say that would violate the broadcast limits?
There's a whole visual center to check the amount of incoming light and adjust your pupils for you. It's intentionally reactive.
> and there is no absolute zero in the system.
There maybe is. I think we call that "blind."
> broadcast systems historically used 16-235 as their luminance range for SDR
Mostly because it was a fully analog system and these all translate down to signal voltage. Jokingly NTSC used to be referred to as "Never Twice the Same Color" due to being a compromise bolted onto the side of an already compromised system.
> There maybe is. I think we call that "blind."
If you go looking into that, you'll see that the reality is far far more complex [0]
"The number of people with no light perception is unknown, but it is estimated to be less than 10 percent of totally blind individuals."
[0] https://chicagolighthouse.org/sandys-view/what-blind-people-...
"While in theory there are cases where you might want to use either type of quantization, if you are in games don't do that!
The reason is that the GPU standard for UNORM colors has chosen "centered" quantization, so you should do that too."
I also enjoyed the 2002 article by Jonathan Blow [1] that's linked at the bottom. The visualization from the first article helped a lot once this started to go more in-depth.
[1] https://web.archive.org/web/20240706043551/https://number-no...
Someday I'll finish... :-(
When you have a 12 inch ruler, you effectively have 13 numbers on the ruler. The fact that zero isn't marked is neither here nor there -- the numeral one is not at the far end of the ruler.
So if you extend the ruler to be as long as you can hold in eight bits, it will range from 0 to 255, and the total length will be 255.
The ruler analogy may seem overly simplistic, but then the real world is likewise fairly simplistic.
At the end of the day, the numbers presumably come from a sensor, or go to a display, and, often, in either case, zero represents as dark as you can get and 255 represents as light as you can get, so the physics dictate that the intervals associated with the 0 and 255 are half the size of the rest of the intervals.
Audio is more interesting than video, because in audio, you care deeply about not having an offset, and about having a balanced signal, so the question of whether the midpoint is actually on a number or not is pertinent.
In audio, it is often useful to simply discard a code so that 0 is the midpoint (e.g. -32767 to +32767 for 16-bit samples, discarding 0x8000). But this still gives you smaller intervals at both ends.
https://uops.info/table.html?search=mulss&cb_lat=on&cb_tp=on...
https://uops.info/table.html?search=shr&cb_lat=on&cb_tp=on&c...
In throughput it's even less of a difference: 2 per cycle vs 3 per cycle.
3x faster
In throughput it's even less of a difference: 2 per cycle vs 3 per cycle.
50% faster
For real usage, today's CPUs are limited by memory bandwidth.
Which happens right after the Porter-Duff Over operator above -- a smoking gun. Which one is it gonna be?
I.e. the display transform is omitted from this and the math involved with the latter makes your whole argument moot.
It can't be expressed well enough with bitshifts to keep your purported 10x speedup anyway (and which I strongly doubt btw).
And lastly: in a software renderer that stuff is usually <0.01% of the compute in the absolute worst case.
P.S.: I'm speaking from 30 years of experience with software rendering in the context of VFX.
Also, you should use SIMD.
The reason is that year 0 never existed. The year 1 BCE was followed by the year 1 CE.
Culturally, anthropologically, and psychologically it might be a different matter. But 2000 years had not passed before the end of that year.
The calendar was back-dated 500 or so years after Jesus, by a European guy before Europe had the concept of zero, leaving us with 1-indexed years. Then, 200 or so years after that, another guy (still lacking the concept of zero) made the even less venerable decision that the year right before 1 AD would be 1 BC.
We could just decide today that 0 came right before 1 AD and was the first year of the first century AD. Then we’d just have to shift all BC dates by 1 year in all our history books.
The upside would be that arithmetic on year labels starts working again. The downside is that there are way too many history books and no one will ever do this.
Of course, the easier way out is to just decide today that either 1) the first century began in 1 BC or 2) the first century had 1 fewer year than all the other centuries.
Why not scale to fill the available bins, though? i.e. trunc(result * 255.999)?
That’s half of the mid-riser staircase quantizer discussed in the article. (The other half is coming up with the reverse.)
(I would implement it as min(floor(x * 256), 255).)
Possibly my proposal doesn’t hold up to repeated transforms and operations. It might skew toward 255 in real operations.
If your conversion from high precision -> 8-bit is just multiplication by 256 and then truncation, then you’ve got the mid-riser quantizer. The +0.5 comes from interpreting a value of 0 as bucket from 0-1, just like the value of 255 is the bucket from 255-256. It’s introduced in the conversion back from 8-bit to high precision.
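A small Python sketch of that pairing, with the 255.999 variant from upthread included for comparison. The function names are mine.

```python
import math

def encode_mid_riser(x):
    """[0, 1] -> 0..255: multiply by 256, truncate, clamp so 1.0 gives 255."""
    return min(math.floor(x * 256), 255)

def encode_scaled(x):
    """The trunc(x * 255.999) variant; needs no clamp for x in [0, 1]."""
    return math.trunc(x * 255.999)

def decode_mid_riser(i):
    """0..255 -> center of the bucket [i, i + 1), scaled back to [0, 1]."""
    return (i + 0.5) / 256

print(decode_mid_riser(0), decode_mid_riser(255))  # 0.001953125 0.998046875
print(encode_mid_riser(1.0), encode_scaled(1.0))   # 255 255
```

Note that neither 0.0 nor 1.0 is ever produced on the way back; that is the price of the +0.5.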
I can only think it's due to integers having undefined behavior on overflow; usually it's wrapping, but not always.
1.0 lies on the right side of the bin 7. But 0.0 lies on the left of bin 0.
The standard approach assumes that we have centered samples: that zero is dead black, plus (and minus!) some uncertainty and so is bin 7.
If the sampling of the intensity is distortion-free (no clipping took place due to overexposure) then bin 7 represents a range of possible values centered around 1.0.
It is not a half-sized interval.
> This means that when converting floating-point values in the [0,1] range back to integers, the extreme bins have effectively half the width of other bins.
Under any interpretation whatsoever of the image samples, there is latitude for interpreting the maximum value 255 as being distortion: clipping from an arbitrarily higher value. Shifting things by 0.5 doesn't fix this issue of not knowing whether 255 means that an intensity close to 1.0 is being represented (no distortion), or an outlier intensity of 37.49 (severely clamped). That could go the other way too.
In other words, there is a possible bias in the extreme bin. The signal could be limited such that the bin's full sampling range is not in effect, or the signal could be overwhelming, so that values far outside of the range are clipped and included.
The only way around this is to make the highest value a canary which represents "clipped value". That is to say, 255 means "clipped datum", so that only 254 and below are samples of unclipped signal. Machine-generated images (e.g. 3D renderings) then avoid the 255 value, and camera sensors are calibrated so that it doesn't occur when technical images are being shot.
You can see this confusion again in the histogram example. There are only 255 bins, not 256. If you fix that mistake and remove the 0.5 offset, then the histogram is distributed correctly at both ends.
You haven't grasped the fact that the choice isn't obvious, and has subtle trade-offs.
If you don't believe the author, check the other posts he references.
[1] https://news.ycombinator.com/item?id=48365800
Edit: somehow missed alterom’s reply - they explain it much better than my question above does.
>There are only 255 bins, not 256
There are 256 bins because there are 256 values.
The questions are:
1. What are the boundaries of these bins?
2. Which sample represents a particular bin?