f16
Systems in Rust
Announcements
- Enrichment assignment
- Implementation of half precision IEEE 754 floating point values
- Use bitwise operations
- Use numeric operations
- Combine them.
Homework
- SHA beckons
- Due Friday, 10 Oct. at 1440 ET.
- After lab section today, do not work on this enrichment assignment until you are done with the SHA homework.
Citation
- Basically I try to take Python.Numpy.float16 as the reference implementation, but break from Python on float16 in the same places `f32` breaks from numpy.float32.
Today
Struct
- Before anything else, we have to find a way to make an `f16`.
  - We will just use a `u16` and treat it differently.
- We will make a custom data structure that contains precisely a `u16`.
- This probably goes in `lib.rs`.
  - Hence the `pub`.
- You can test that you created a type as follows:
- The first `f16` is the crate name, the second is the type name.
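As a minimal sketch of such a check, assuming a single field named `bits` (a hypothetical name) and using an inline module as a stand-in for the `f16` crate so the snippet fits in one file:

```rust
// Stand-in for a library crate named `f16`; in a real project the module
// body would be the contents of src/lib.rs.
pub mod f16 {
    /// Half-precision value stored as its raw 16-bit pattern.
    #[derive(Clone, Copy, Debug, PartialEq)]
    pub struct f16 {
        pub bits: u16, // hypothetical field name
    }
}

/// Compiles only if both the module (crate) and the type are reachable
/// from outside, which is what `pub` provides.
pub fn zero() -> f16::f16 {
    f16::f16 { bits: 0 }
}
```

If `zero()` compiles and the type is exactly two bytes wide, the type exists and holds precisely a `u16`.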
Aside
- For my money, `f16::f16` looks terrible.
Aside
- Rust will admonish you for using `f16` instead of `F16`.
warning: type `f16` should have an upper camel case name
 --> src/lib.rs:1:12
  |
1 | pub struct f16 {
  |            ^^^ help: convert the identifier to upper camel case (notice the capitalization): `F16`
  |
  = note: `#[warn(non_camel_case_types)]` on by default
- Everyone is a critic!
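If you would rather keep the lowercase name, the lint named in that note can be allowed on the type itself. A minimal sketch, where the `bits` field is an assumed layout rather than a required one:

```rust
// Silences the naming lint for this one type only; the rest of the crate
// still gets the usual upper-camel-case check.
#[allow(non_camel_case_types)]
pub struct f16 {
    pub bits: u16, // hypothetical field name
}
```

Renaming the type to `F16` is the other way to make the warning go away.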
Aside
- We aren’t really all the way to teaching structs yet, but feel free to play around with them.
- In particular, you may want to have 3 fields, `sign`, `exp`, and `mantissa`, rather than a single `bits` field.
- The internal implementation is up to you with one exception:
- Only use integer operations to emulate floating point operations.
- (That’s what we’re trying to learn how to do here).
- You can read more on `struct` here: Rust Book 05.01
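Whichever layout you pick, moving between a single `u16` and the three fields takes only shifts and masks. A minimal sketch, assuming a hypothetical `Parts` struct and helper names `unpack` and `pack`:

```rust
/// The three fields of a half-precision value (hypothetical helper type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Parts {
    pub sign: u16,     // 1 bit
    pub exp: u16,      // 5 bits, still biased
    pub mantissa: u16, // 10 bits, without the implicit leading 1
}

/// Splits a raw 16-bit pattern into its fields.
pub fn unpack(bits: u16) -> Parts {
    Parts {
        sign: bits >> 15,
        exp: (bits >> 10) & 0x1F,
        mantissa: bits & 0x3FF,
    }
}

/// Reassembles the fields, masking each one to its width.
pub fn pack(p: Parts) -> u16 {
    ((p.sign & 1) << 15) | ((p.exp & 0x1F) << 10) | (p.mantissa & 0x3FF)
}
```

For instance, `0x3C00` (the pattern for 1.0) unpacks to sign 0, biased exponent 15, mantissa 0.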
Input/Output
Versus built-in types, we'll just provide a way to get values in and out.
We'll use the Rust default integer, `i32`.
Write the following functions:
fn i32_to_f16(n:i32) -> f16
fn print_f16(x:f16)
// optional, but i used it - print_f16 and a newline.
fn println_f16(x:f16)
- You may implement more complex handling, or set every overflow to `inf` and underflow to `0`.
- The following table should be helpful, from the lecture:
Format | Sign | Exp. | Mant. | Bits | Bias | Prec. |
---|---|---|---|---|---|---|
Half | 1 | 5 | 10 | 16 | 15 | 11 |
Single | 1 | 8 | 23 | 32 | 127 | 24 |
Double | 1 | 11 | 52 | 64 | 1023 | 53 |
Quad | 1 | 15 | 112 | 128 | 16383 | 113 |
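The Bits, Bias, and Prec. columns all follow from the two field widths, which can be handy if you would rather compute constants than hard-code them. A small sketch (the function names are my own, not part of the assignment):

```rust
/// Exponent bias: 2^(exp_bits - 1) - 1.
pub const fn bias(exp_bits: u32) -> u32 {
    (1 << (exp_bits - 1)) - 1
}

/// Precision counts the stored mantissa bits plus the implicit leading 1.
pub const fn precision(mant_bits: u32) -> u32 {
    mant_bits + 1
}

/// Total width: one sign bit plus both fields.
pub const fn total_bits(exp_bits: u32, mant_bits: u32) -> u32 {
    1 + exp_bits + mant_bits
}
```

With 5 exponent bits and 10 mantissa bits this gives the Half row: 16 bits, bias 15, precision 11.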
- Here is an example:
1.512*2^3
[src/main.rs:4:5] println_f16(i32_to_f16(12)) = ()
1.944*2^6
[src/main.rs:5:5] println_f16(i32_to_f16(123)) = ()
1.210*2^10
[src/main.rs:6:5] println_f16(i32_to_f16(1234)) = ()
- Note that the debug statement logging the function call prints after the value is displayed, because the debug statement is only printed after the function call returns.
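The digits after the point in the example appear to be the raw 10-bit mantissa field written in decimal (12 is 1.5 * 2^3, and 0.5 * 1024 = 512). Under that reading, here is one possible sketch of the conversion and printing functions. It assumes a single `bits` field, truncates instead of rounding, sets overflow to `inf`, and uses a hypothetical `format_f16` helper; the printer only handles normal values.

```rust
#[allow(non_camel_case_types)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct f16 {
    pub bits: u16, // hypothetical single-field layout
}

const BIAS: i32 = 15; // half-precision exponent bias
const MANT_BITS: u32 = 10; // stored mantissa bits

/// Converts an integer to half precision, truncating extra low bits.
pub fn i32_to_f16(n: i32) -> f16 {
    let sign: u16 = if n < 0 { 0x8000 } else { 0 };
    let mag: u32 = n.unsigned_abs();
    if mag == 0 {
        return f16 { bits: 0 };
    }
    // Index of the highest set bit is the unbiased exponent.
    let exp: u32 = 31 - mag.leading_zeros();
    if exp > 15 {
        // Magnitudes of 2^16 or more do not fit: set to infinity.
        return f16 { bits: sign | 0x7C00 };
    }
    // Move the leading 1 to bit 10, then mask it off (it is implicit).
    let aligned = if exp > MANT_BITS {
        mag >> (exp - MANT_BITS)
    } else {
        mag << (MANT_BITS - exp)
    };
    let mantissa = (aligned & 0x3FF) as u16;
    let biased = (exp as i32 + BIAS) as u16;
    f16 { bits: sign | (biased << 10) | mantissa }
}

/// Renders a normal value as "1.<mantissa field>*2^<exponent>".
/// Zero, subnormals, infinities, and NaN are not handled here.
pub fn format_f16(x: f16) -> String {
    let sign = if x.bits >> 15 == 1 { "-" } else { "" };
    let exp = ((x.bits >> 10) & 0x1F) as i32 - BIAS;
    let mantissa = x.bits & 0x3FF;
    format!("{}1.{}*2^{}", sign, mantissa, exp)
}

pub fn print_f16(x: f16) {
    print!("{}", format_f16(x));
}

pub fn println_f16(x: f16) {
    println!("{}", format_f16(x));
}
```

Calling `println_f16(i32_to_f16(12))` should reproduce the first line of the example.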
Aside
- It is admittedly odd to print values in decimal multiplied by a power of 2.
- If you wish to print in binary, simply prefix with `0b`.
0b1.1000000000*10^11
[src/main.rs:4:5] println_f16(i32_to_f16(12)) = ()
0b1.1110110000*10^110
[src/main.rs:5:5] println_f16(i32_to_f16(123)) = ()
0b1.11010010*10^1010
[src/main.rs:6:5] println_f16(i32_to_f16(1234)) = ()
- I didn’t find this easier (I just yeeted the output into Ye Olde Python III) but you may.
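If you do go the binary route, one way to sketch it is with Rust's `{:b}` formatting. The function name below is hypothetical and it takes the raw bit pattern directly. Zero-padding the mantissa to ten digits keeps any leading zeros, which are significant after the binary point (the mantissa field for 1234 is `0011010010`).

```rust
/// Renders a normal half-precision bit pattern in binary scientific form.
/// Zero, subnormals, infinities, and NaN are not handled here.
pub fn format_f16_binary(bits: u16) -> String {
    let sign = if bits >> 15 == 1 { "-" } else { "" };
    let exp = ((bits >> 10) & 0x1F) as i32 - 15; // remove the bias
    let exp_sign = if exp < 0 { "-" } else { "" };
    let mantissa = bits & 0x3FF;
    // {:010b} pads the mantissa to its full 10-bit width.
    format!(
        "{}0b1.{:010b}*10^{}{:b}",
        sign,
        mantissa,
        exp_sign,
        exp.unsigned_abs()
    )
}
```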
Aside
- Speaking of yeeting, if you prefer this formulation:
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.00s
Running `target/debug/bits`
1.512*2**3
1.944*2**6
1.210*2**10
- I won’t say anything if you don’t.
Add/sub
- Write the following functions:
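As a hedged starting point for addition, the sketch below handles only positive, normal inputs, working on raw bit patterns with integer operations. The name `add_positive_bits` is hypothetical and is not the assignment's required signature; signs, subtraction, rounding, and NaN are left out, and bits shifted out during alignment are simply dropped.

```rust
/// Adds two positive, normal half-precision bit patterns (truncating).
pub fn add_positive_bits(a: u16, b: u16) -> u16 {
    // For positive values the bit patterns sort like the numbers themselves.
    let (hi, lo) = if a >= b { (a, b) } else { (b, a) };
    let exp_hi = (hi >> 10) & 0x1F;
    let exp_lo = (lo >> 10) & 0x1F;
    if exp_lo == 0 {
        return hi; // treat zero (and subnormals) as contributing nothing
    }
    // Restore the implicit leading 1 to get 11-bit significands.
    let sig_hi = ((hi & 0x3FF) | 0x400) as u32;
    let sig_lo = ((lo & 0x3FF) | 0x400) as u32;
    // Align the smaller value to the larger exponent.
    let shift = (exp_hi - exp_lo) as u32;
    let mut sum = sig_hi + (sig_lo >> shift);
    let mut exp = exp_hi;
    // A carry into bit 11 means the result needs one more exponent step.
    if sum & 0x800 != 0 {
        sum >>= 1;
        exp += 1;
    }
    if exp >= 31 {
        return 0x7C00; // overflow: set to +inf
    }
    (exp << 10) | (sum as u16 & 0x3FF)
}
```

For example, `0x3C00` (1.0) plus `0x4000` (2.0) gives `0x4200` (3.0).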
Mul/div
- Write the following functions:
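Multiplication is arguably the gentler of the two to start with: multiply the significands, add the exponents, and renormalize. A rough sketch on raw bit patterns follows. The name `mul_bits` is hypothetical, results are truncated rather than rounded, zero and subnormal inputs are treated as zero, and infinity and NaN inputs are not handled.

```rust
/// Multiplies two half-precision bit patterns using integer operations only.
pub fn mul_bits(a: u16, b: u16) -> u16 {
    let sign = (a ^ b) & 0x8000; // signs combine by XOR
    let exp_a = ((a >> 10) & 0x1F) as i32;
    let exp_b = ((b >> 10) & 0x1F) as i32;
    if exp_a == 0 || exp_b == 0 {
        return sign; // zero (or subnormal) operand: signed zero
    }
    // 11-bit significands with the implicit leading 1 restored.
    let sig_a = ((a & 0x3FF) | 0x400) as u32;
    let sig_b = ((b & 0x3FF) | 0x400) as u32;
    let mut prod = sig_a * sig_b; // leading 1 lands on bit 20 or bit 21
    let mut exp = exp_a + exp_b - 15; // both were biased, so remove one bias
    if prod & (1 << 21) != 0 {
        prod >>= 1; // renormalize so the leading 1 sits on bit 20
        exp += 1;
    }
    if exp >= 31 {
        return sign | 0x7C00; // overflow: set to infinity
    }
    if exp <= 0 {
        return sign; // underflow: set to zero
    }
    let mantissa = ((prod >> 10) & 0x3FF) as u16;
    sign | ((exp as u16) << 10) | mantissa
}
```

For example, `0x4200` (3.0) times `0x4200` gives `0x4880` (9.0).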
Today
Optional
Implement infinities
- Ensure that when any value is divided by zero, or any arithmetic result overflows, the value is set to infinity.
- Ensure infinity is printed consistently, perhaps as `inf` or as unicode ∞.
- Ensure both positive and negative infinity are supported.
- Ensure infinity matches IEEE 754 requirements:
  - Sign: Either.
  - Exponent: All `1`s.
  - Mantissa: All `0`s.
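Those requirements pin down exactly two bit patterns, so the checks are short. A minimal sketch on raw bit patterns, with hypothetical names; `div_by_zero` assumes the dividend is nonzero and not NaN:

```rust
pub const POS_INF: u16 = 0x7C00; // sign 0, exponent 11111, mantissa 0
pub const NEG_INF: u16 = 0xFC00; // sign 1, exponent 11111, mantissa 0

/// True for either infinity: ignore the sign bit, compare the rest.
pub fn is_inf(bits: u16) -> bool {
    bits & 0x7FFF == POS_INF
}

/// Result of dividing a nonzero, non-NaN value by a (signed) zero:
/// infinity whose sign is the XOR of the two operand signs.
pub fn div_by_zero(dividend: u16, zero: u16) -> u16 {
    ((dividend ^ zero) & 0x8000) | POS_INF
}

/// One consistent spelling for printing, or None for any other value.
pub fn show_inf(bits: u16) -> Option<&'static str> {
    match bits {
        POS_INF => Some("inf"),
        NEG_INF => Some("-inf"),
        _ => None,
    }
}
```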
Implement NaN
- Ensure that when infinities of differing signs are summed (whether via add or sub), `NaN` is returned.
- Ensure that zero divided by zero, or `inf` divided by `inf`, returns `NaN` or `nan` (up to you).
- Ensure any operation over `NaN` and any other value returns `NaN`.
- Ensure `NaN` is printed consistently.
- Ensure `NaN` matches IEEE 754 requirements:
  - Sign: Either.
  - Exponent: All `1`s.
  - Mantissa: Anything but all `0`s.
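One way to organize this is to check the special cases before any ordinary arithmetic runs. The sketch below does that for addition on raw bit patterns. All names are hypothetical, and `0x7E00` is just one valid choice of `NaN` pattern (any nonzero mantissa qualifies).

```rust
pub const QNAN: u16 = 0x7E00; // exponent all 1s, mantissa nonzero

/// Exponent all 1s and mantissa anything but all 0s.
pub fn is_nan(bits: u16) -> bool {
    (bits & 0x7C00) == 0x7C00 && (bits & 0x03FF) != 0
}

/// Exponent all 1s and mantissa all 0s, either sign.
pub fn is_inf(bits: u16) -> bool {
    bits & 0x7FFF == 0x7C00
}

/// Handles the special cases of a + b. Returns None when both inputs
/// are finite, meaning the ordinary addition path should run instead.
pub fn add_special(a: u16, b: u16) -> Option<u16> {
    if is_nan(a) || is_nan(b) {
        return Some(QNAN); // NaN with anything is NaN
    }
    match (is_inf(a), is_inf(b)) {
        // Same-signed infinities stay infinite; differing signs give NaN.
        (true, true) => Some(if a == b { a } else { QNAN }),
        (true, false) => Some(a),
        (false, true) => Some(b),
        (false, false) => None,
    }
}
```

Subtraction can reuse the same check by flipping the sign bit of the second operand first.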
See also
Floating point values like f16 see use across a range of disciplines, and each of these has a major result named after the f16, more or less: