Fewest radix-10 digits needed to hold any given radix-2 number

Was trying to work out a function to map the size, in base-2 bits of a number, to the minimum number of base-10 digits needed to represent the number (and to useful: if the function has an error the error must be upward). I couldn’t work out anything, so I wrote out the numbers to see if there was any obvious pattern. And hey presto:

bits Min base-10
0 1 2 3 1
4 5 6 2
7 8 9 3
10 11 12 13 4
14 15 16 5
17 18 19 6
20 21 22 23 7
24 25 26 8
27 28 29 9
30 31 32 33 10
34 35 36 11
37 38 39 12
40 41 42 43 13
44 45 46 14
47 48 49 15
50 51 52 53 16
etc..

A sufficient function must then be, using C-ish notation, and assuming ‘/’ is an integer divide:

f(x) -> (x%10 <= 3 ? (3x / 10 + 1)

: (x % 10 <= 6 ? (3x / 10 + 2)

: (3x / 10 + 3))

Update: f(x) -> (x % 10 <= 6 ? (3x / 10 + 2 : (3x / 10 + 3))

What I’m wondering though is:

1. Was there a better way to work out the function?
2. Is there a neater way of expressing it?

Update: As cjb points out:

2 = 10^n

=> n = log(2)/log(10)

=> 2^p = (10^n)^p

=> 2^p = 10^(np)

ergo, for any given p,  ceiling(np) digits are required

As I’m a bit dense, cjb had to re-explain it to me in terms of information content. To paraphrase (hopefully not too badly):

1 digit of base-10 has log2(10) bits of information, and similarly 1 digit of base-2 has log10(2) (ie 1/log2(10)) decimal units (aka ‘hartley‘) of information. Therefore, n binary bits have nlog10(2) hartley’s of information. Round this up to get the minimum required decimal digits.

cjb also pointed out a transcription error in the function (left out the multiply by 3). He has some other interesting points I’m still trying to get my head around :).

Update2: CJB points out there’s a problem with my table, and hence with my function. The below again paraphrases his explanation. My table assumes that for every additional 10 bits, that this corresponds perfectly with an additional 3 decimal digits (hence the recurring pattern). I.e. that the ratio is 10:3. However, this isn’t the case – as we can see from above, the ratio is actually:

1 : log10(2) = 3.322

This is obviously quite close to, but not the same as the 10:3 ratio in the table . Therefore the table diverges very slowly from the actual answer. To find where it diverges sufficiently to throw off the table:

the divergence ratio is 1024:1000, for every 3 rows.

so the divergence reaches 1 at: 3/log(1.024) ~= 291.2

so the columns of the table become out of sync around 291.

And lo and behold:

table:

292*3/10+1 = 88

293*3/10+1 = 88

by nlog2(p):

292*l(2)/l(10) = 87.90

293*l(2)/l(10) = 88.20 (oops, too big for 88)

Discrepencies actually occur between the 2 methods prior to this, at 103, 113, 123, … I don’t have an explanation for those as yet. Just removing the <= 3 band, and having those values use  the <=6/+2 band allows the integer-only method to work up to about 1073 bits.

Cjb has a better method, more precise and robust and integer only:

f(x) = 30103x/100000 + 1;

which is good up to about 13300 bits!

How to make it easy for a maintainer to apply your patch

• If you don’t know git, take a day or two to learn (it’s not that hard, and the gitk and git-gui GUI tools can help a lot)
• Read through the existing commit messages to get a feel for the common style – and try use this style in your own commit messages:
• A one-line summary as the 1st line of the commit message. You may wish to start it with ‘applicability:’, e.g. ‘bgpd:’, ‘netlink:’, ‘freebsd:’, etc. This is used for release announcements.
• The body of the commit message, containing a high-level description of the goal of the changes, including the problem being addressed and the reasoning for the commit, on a change by change basis (e.g. per file, even per-function, if that’s appropriate). For a large commit, an introductory overview may be needed.The goal is two-fold: To force the submitter to perform a self-review of the commit (and so catch problems); and to give the reader the required context to review the commit.
• An honest indication of the confidence you have in the patch, including what, if any testing has been done.
• If the patch contains work done by others, e.g. because it is a rework of some other patch, indicate this. Though, ideally, the original patch should be a separate commit, attributed correctly with the –author argument to git-commit.
• Read the HACKING document (parts are out of date though)
• Prepare a branch, based on the Quagga.net master, that contains your changes and nothing else (merges from master are ok; if you need to merge it together with other things for your own purposes, then use another branch – they’re cheap).
• If you like to rebase published branches, please name-space such branches with a ‘volatile/’ prefix.
• Order the commits so the that the safest, least-contraversial commits are earlier in the branch than patches that are less likely to be mergeable. This allows maintainers to, perhaps, still be able to pull part of your branch if some patches are deemed problematic.
• If you have a large and/or less-obviously-safe patch, try to split the patch up into smaller ones. E.g. there may be infrastructural improvements that can be factored out and go in first.

It’s extremely rare that contributors do all of the above. Commit messages in particular are often free-form. Sometimes they describe the problem, but not why the change fixes it. Sometimes the change is described, but without reference to the problem. Sometimes the message is akin to a C-to-english transliteration.

It might seem like a pain to force submitters to spend 5 to 10 minutes to go through their patch and construct a structured, detailed, but high-level description of their patch and their intent. However not doing so forces the reviewer to spend even more time to mentally construct the same. If you have many contributors feeding patches to a small number of maintainers, then it just doesn’t scale unless contributors make an effort to make their patches as reviewable and easy-to-integrate as possible.

In short, try to make life easy for your maintainer. There are several small, maybe even apparently trivial, things you can do quite easily which help reduce their load, and help smooth your patches’ path.

Updates: Git strips out []’s, so change section recommending patch subjects encode information in them.