Jeff Epler's blog

6 July 2023, 19:04 UTC

NYT "Letter Boxed" solver in C++


My wife and I regularly solve the NYT crossword in the app.

Lately it's been trying to get me to solve a puzzle called "Letter Boxed".

In this puzzle, there are 12 letters arranged with 3 on each side of a square. Valid words have to be made from the letters, with the additional constraint that two consecutive letters may not be on the same side of the square. For example if the edges are "xoy", "tws", "apv" and "kri" then "air" is not a valid word because the consecutive letters "ir" come from the same side. "vow" is. The last constraint is that each consecutive word starts with the last letter of the previous word, so "vow" can be followed by "wok" but not "sat". (Words of any length are permitted, not just 3 letters)

There is no particular scoring to the game, though you're suggested to "try to solve it in # words"; you can also view the previous day's suggested solution.

To me, it seems that the best answer is in the fewest words, with ties broken by the fewest number of characters.

The following program (which assumes that there's a standard unix-style dictionary at /usr/share/dict/words) can find what appear to be "optimal" solutions in only a few milliseconds. Simply supply it with the 12 letters as the first commandline argument, and it will perform a breadth-first search (BFS) with up to 5 words in length. Each candidate printed is better scoring (shorter) than the previous one, so the last line is the best score.

$ ./a.out xoytwsavpkri
228 candidate words
proviso - oaks - sixty - yaw
provisos - sixty - yaw - wk
vow - warps - sixty - yak
Checked 3027570 sequences

I originally wrote a program in Python, but memory usage was high and speed was low. This version uses an efficient structure where each word is reduced to a 32-bit quantity that tracks the letters present, the word length, and the word's terminal character. A particular game play is characterized by a fixed-size data structure that includes the characters used so far, the number of words, the number of total characters, and an array of up to 5 words. The deque is initialized with one entry for each possible word. In the main loop, the item is taken from the front of the deque. Then, based on the terminal letter of the last word played it tries each word starting with that letter. If this potential solution is not lower-scoring than the best one, then it stops evaluating. Otherwise, if this word completes the puzzle, the solution is printed and the best known score values are updated. Otherwise, this puzzle state is added to the end of the deque. The program loops until all possibilities have been evaluated.

A dynamic-programming approach would probably beat the BFS but BFS is quite fast enough for the published puzzles I've solved. It might also be possible to work from both ends towards the middle.

I don't have enough experience with the puzzle to know if 5 words always suffice for published puzzles, but it seems likely. The two real puzzles I have tried have 2-word solutions from this program (13 characters, the minimum possible length for 2 words), while they were suggested to be solvable in 5 and 6 steps.

When it comes to randomly selected puzzles, there are possible boards for which I find a best answer of 7 words (using a modified version of the program)

knezcuvamybx
68 candidate words
zany - yak - kc - can - numb - beaux - xv 7/23
zany - ye - exec - cab - bevy - yuk - km 7/22
zany - yuk - km - me - eve - exec - cab 7/21
Checked 3516880 sequences

and other sequences which have no solutions up to 7 words and use a lot of RAM and time before giving no solution (11GB peak resident size, 16 seconds):

mtijacfshpwk
194 candidate words
Checked 636279365 sequences

I didn't go full dynamic-programming but I did track the best way to each each of the 12*4096 states and stop recursing if the new candidate doesn't reach a known state faster. This runs much faster and uses less memory; the 11GB & 16s example with no solution above is now <4MB and <.01s! Runs below are with that version. (and apparently using a different dictionary, some runs were on debian oldstable and some on debian stable)

mtijacfshpwk
214 candidate words
Checked 1944 sequences

I also found that there are 8-word puzzles:

kmujpocaziqh
69 candidate words
jam - ma - aqua - ah - ho - oz - zip - pick 8/22
Checked 854 sequences
and 9+-word but I don't like some of the words (and the app doesn't accept 2-letter words):
dnijzaexkyrc
153 candidate words
jerk - kc - ca - ax - xi - icky - yd - dz - zen 9/23
Checked 1038 sequences

Source code (GPL-3.0 license) (build with g++ -std=c++20 -O2):

Faster version:

Original version:

[permalink]

21 December 2022, 15:12 UTC

A quick example of transforming Python with libcst


I had occasion to encounter a Python library that used assert with a side effect:

assert initialize_hardware(), "hardware failed to initialize"
looking a bit more widely, this idiom was apparently used hundreds of times across a family of Python libraries.

"Aha", I said, "I bet I can fix this with an automated tool". In this round of investigation, I found LibCST and set about creating a program that would do what was needed, namely, to turn at assert into if not initialize_hardware(): raise RuntimeError("hardware failed to initialize").

While LibCST has an explicit facility for "codemodding", I didn't notice it at first and wrote in terms of transformers, with my own command-line driver program.

Unfortunately, while my transformer succeeded, attempting to format the CST back into code would result in an error without very many matches on my favorite search engine: Unexpected keyword argument 'default_semicolon'. That linked issue didn't provide an answer, but my further investigation did.

In the Python grammer as represented by LibCST, an assert statement is part of a SimpleStatementLine, while an if statement is not wrapped in a SimpleStatementLine. So if the transformation of an Assert node into an If node is done alone, the new If node lies inside a SimpleStatementLine node, and that is not valid. The problem is not detected until rendering the CST back into code. (It may be possible that using type checking would have found a problem, as this is essentially a type error)

The solution that I arrived at was to also transform any SimpleStatementLine which ended up containing an If node, by using the FlattenSentinel to do it. I think it might have been even more correct to directly perform the transformation within SimpleStatementLine, but what I ended up works now.

[permalink]

31 August 2022, 2:14 UTC

Don't wreck your system with miniconda/anaconda


I guess this software is a tolerable way to install those libraries and packages needed for so many machine learning things written in Python. But annoyingly, it wants to "go to the head of the line" in front of system Python. This is reallllyyyy not what I want.

I noticed that the little blob it deposits in ~/.bashrc can easily be surrounded with a function definition. So, now to activate anaconda in the current shell, but never replace/hide system python in a normal shell, I can just type "fml".

fml () {
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/home/jepler/miniconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/home/jepler/miniconda3/etc/profile.d/conda.sh" ]; then
        . "/home/jepler/miniconda3/etc/profile.d/conda.sh"
    else
        export PATH="/home/jepler/miniconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
}

Now I don't feel quite so worried that having it present on the system is going to interfere with system software or with software I've installed to work with system software via pip.

[permalink]

5 October 2021, 23:41 UTC

My experience adding type annotations to a 2.5k-line Python library


The wwvb package for Python has been a focus of my recent hobby-time programming. I've used it as a place to educate myself about the ins and outs of maintaining a Python package. In the past, I used it to learn about using pylint, black & code coverage to improve the quality of Python code. Most recently, I added type annotations through the whole package until mypy --strict was happy with the whole wwvb package and uwwvb module.

The annotations were added in two steps: See pull requests #7 and #8. Together, these PRs contained 320 insertions and 223 deletions across 14 python files, plus 6 insertions in 2 other files related to CI. I did the work during a part of a day, probably under 4 hours of time spent. Since the package currently contains exactly 2500 physical lines of Python code, adding type annotations touched or added over 10% of physical lines!

read more…

18 August 2021, 19:03 UTC

Using Adafruit Macropad as LinuxCNC Control Pendant


Update, 2021-09-25: For compatibility with CircuitPython 7.0.0


CircuitPython recently gained the power to have custom USB descriptors. With these, we can define a USB HID device that will work with LinuxCNC's hal_input.

For instance, the Adafruit Macropad has a (very coarse, just 20 detents/revolution) encoder, 12 keyswitch positions, and an OLED screen.

The two pieces of software below, when placed in the CIRCUITPY drive as boot.py and code.py configure it for use with hal_input, using a halcmd line similar to loadusr -W hal_input Macropad. I haven't actually done the work of hooking it all the way up to Touchy yet, but it causes all the buttons & the encoder to appear in halcmd show pin.

This is just the device I picked first; there's nothing to prevent you from hooking up more exotic things like voltage/temperature monitors through added CircuitPython code. Addition of output reports for status indicators is left for the motivated reader.

read more…

24 June 2021, 1:43 UTC

Quick CircuitPython Driver for ES100 WWVB Receiver


I picked up an ES100 WWVB receiver kit and wrote a quick & dirty library to interact with it in CircuitPython.

I'm not super thrilled with how the chip works; I imagined that the date & time registers would act like an RTC after a successful reception, but instead they just mark the second when reception & decoding completed and are cleared to zero as soon as a new reception attempt is kicked off.

Still, I'll have to figure out a clock to put it inside. I am still thinking of doing an edge-lit display version of the Roman Solar Clock, so maybe that's where it'll go.

The library is jepler_es100.py and the example is code_es100.py (rename to code.py). I ran it on a Feather nRF52840 Expess with CircuitPython 6.3, but it should work on a range of boards.

Because the ES100 just locks up the I2C bus if you "repeated-start" it, I had to use my custom rolled register library instead of adafruit_register. I did build it on top of adafruit_bus_device.

Files currently attached to this page:

code_es100.py1.1kB
jepler_es100.py3.3kB

[permalink]

16 August 2020, 18:05 UTC

Si5351 Frequency Planner in Python


The Si5351 and related clock generators are very flexible, but they are a bit cumbersome to "program".

The basic design of the Si5351 is that an incoming frequency is first multiplied to create a high internal frequency (nominally in the range from 600MHz to 900MHz), then divided to create an output frequency. The multipliers and dividers have restrictions on their ranges, there are just 2 PLLs but 3 output clocks, and certain types of configurations (e.g., "divisor is an integer multiple of 2") are said to give lower jitter outputs than others.

The datasheet advising users on how to create their own register values is very complicated and some of the parts resist comprehension even after multiple readings. Thie chip maker Silicon Labs provides a graphical closed source Windows program for generating register maps, though some internet commenters regard it as buggy.

This program represents my own effort to create a frequency planner. It neglects some datasheet rules, mostly related to output frequencies above 50MHz or so. It tries to

  • Favor lower-jitter configurations where possible
  • Favor making the first-listed frequency as accurate as possible
  • Try each combination of ways to allocate output clocks to the PLLs
  • Obey the datasheet restrictions as I've understood them
  • Do all arithmetic as infinite precision arithmetic, not floating point
It does not
  • Implement the divisor rules for extremely fast clocks
  • Implement the divisor rules for higher numbered outputs on 20QFN packages

For exact input clocks such as 25MHz, it is surprisingly hard to find a "plausible" frequency that is inexact, and even harder to find a "plausible" frequency that is less exact than your reference clock. I didn't actually find a case where the error is not much smaller than the frequency stability of any reference I'd plausibly have access to.

(Of course, if you measure your reference clock as 25MHz + 13.37Hz and plug that in as the --input-freq then almost everything is related to that clock inexactly, so the fact that the clocks will still be really, really close to the requested value is very nice.)

It does not directly create a register map, but the values shown are easy enough to convert to the form used in the CircuitPython adafruit_si5151 library. Sadly, as CircuitPython does not support the fractions module, it's not currently feasible to run this code on a CircuitPython board directly controlling the si5351 chip.

Example:

Generate 315M/88 (NTSC colorburst), 4.43361875M (PAL colour carrier), 25.8048M (UART clock) all exactly from a 25MHz reference clock or crystal. However, the PAL colour carrier will have more jitter since it uses a fractional divisor:

$ ./party.py 315M/88 4.43361875M 25.8048M
Input frequency: 25000000 (25000000.0)
Frequency plan score: 10.38

PLL A:
Frequency plan type: Fractional multiplier, double integer divisor (1)
Multiplier = 126/5 (25.2)
Intermediate frequency = 630000000 (630000000.0)

Desired output frequency: 39375000/11 (3579545.4545454546)
Divider = 176 (176.0)
Exact
r_divider = 0 (/ 1)


PLL B:
Frequency plan type: Fractional multiplier, fractional divisor (5)
Multiplier = 12059443/500000 (24.118886)
Intermediate frequency = 602972150 (602972150.0)

Desired output frequency: 17734475/4 (4433618.75)
Divider = 136 (136.0)
Exact
r_divider = 0 (/ 1)

Desired output frequency: 25804800 (25804800.0)
Divider = 12059443/516096 (23.366666279141864)
Exact
r_divider = 0 (/ 1)

Generate pi MHz from a 25MHz reference clock or crystal (error: 27.8 nano Hz)

$ ./party.py 3.1415926535898M
Input frequency: 25000000 (25000000.0)
Frequency plan score: 1.00

PLL A:
Frequency plan type: Fractional multiplier, fractional divisor (5)
Multiplier = 24 (24.0)
Intermediate frequency = 600000000 (600000000.0)

Desired output frequency: 15707963267949/5000000 (3141592.6535898)
Divider = 40306053/211042 (190.98593171027568)
Actual output frequency: 42208400000000/13435351 (3141592.653589772)
Relative Error: -8.83757e-15
Absolute Error: 2.78e-08Hz
r_divider = 0 (/ 1)

The source is available in a github gist:

[permalink]

3 June 2019, 1:17 UTC

Precision vs Accuracy: A Clock


I was inspired by this watch face design (I think that's the original version) and by the arrival of a "OCXO", a very reliable time keeping circuit, to finally make an electronic clock.

Accuracy: Between hardware and software tuning, the OCXO keeps time with an accuracy of possibly better than 100 microseconds per day (loses or gains well less than a half second per year) (Yes, I'm deliberately ignoring a lot about crystal aging here!)

Precision: The time displayed is to the nearest minute, and the touchscreen setting mechanism is (deliberately?) poor, making it hard to set the time closer than +- 2 minutes or so. Oh, and it takes a good fraction of a second to update the screen anytime it changes. (The best way to set it seems to be to wait until a few seconds before 6AM/6PM and plug it in, since it boots with that time showing)

The clock consists of: .. all in a 3d printed enclosure that's not quite the right size.


The dial as an animation (1 revolution = 12 hours)
Along the way, I added an even more accurate time source to a Raspberry PI (GPS with PPS) so that I could even measure the accuracy of the OCXO, and discovered I even have a GPS module which was negatively affected by the GPS rollover that occurred in April of this year (the second 1024-week rollover). This leads to a surprising sequence of clock arithmetic, and finally gpsd decides the GPS is returning a date sometime back in 1963.

[permalink]

19 February 2014, 21:17 UTC

Red: the perfect unit of measurement?

17 January 2013, 16:34 UTC

moinmoin cleanup script

1 December 2012, 1:12 UTC

Android "Birds of Australia" unpacker

9 October 2012, 2:30 UTC

FTL (un)packer for Linux

All older entries
Website Copyright © 2004-2024 Jeff Epler