Hi *,
I'd like a method to retrieve the complete key structure from a
json string and I'm using the json tcllib module.
It may contain an unknown number of nested levels and json arrays.
I have found some solutions which basically work, but I cannot
reliably distinguish between a normal - possibly nested - json
object and a json array.
All the examples I found fail on json arrays.
The extracted or created keys may eventually be used for accessing the corresponding values - json does not create specific ids for the array entries, they are a list from Tcl's point of view.
Has anyone tried or mastered this challenge?
Here's a typical example:
---
https://www.tech-edv.co.at/download/testdata/livedata_20250914.txt
On 15/09/2025 17:58, Gerhard Reithofer wrote:
Hi *,
I'd like a method to retrieve the complete key structure from a
json string and I'm using the json tcllib module.
[...]
I'm by no means a JSON expert, in natural language terms I'd describe
myself as speaking schoolboy JSON, so I may have overlooked some
technical subtlety. As far as I can see
package require json
package require http
package require tls
http::register https 443 {tls::socket -autoservername true}
set tok [http::geturl https://www.tech-edv.co.at/download/testdata/livedata_20250914.txt]
set jsStr [http::data $tok]
http::cleanup $tok
set jsDict [json::json2dict $jsStr]
delivers a completely usable dictionary.
dict get $jsDict inverters
returns a two-element list which is analogous, in many programming languages, to an array with two valid indexes.
Sorry if I've missed something.
Alan
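To spell out the array access (a sketch only; the layout of the entries in
the live data is an assumption here, so just generic list operations are
shown):

# The JSON array comes back as a plain Tcl list, so its entries are
# addressed by position rather than by key.
set inverters [dict get $jsDict inverters]
puts "count: [llength $inverters]"
puts "first: [lindex $inverters 0]"
set i 0
foreach entry $inverters {
    puts "inverters\[$i\] -> $entry"
    incr i
}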
Gerhard Reithofer <g.reithofer@tech-edv.co.at> wrote:
Hi *,
This is because the tcllib json module does not return any typing information, so you have to guess as to whether you have an "object" or
an "array" at any given level.
In article <10abki9$2ift8$1@dont-email.me>,
Alan Grunwald <nospam.nurdglaw@gmail.com> wrote:
On 15/09/2025 17:58, Gerhard Reithofer wrote:
Hi *,
I'd like a method to retrieve the complete key structure from a
The disadvantage being that it's not a standard part of tcllib.
https://github.com/RubyLane/rl_json
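If a binary package is acceptable, rl_json keeps the value as JSON and can
report its type; roughly like this (a sketch - the subcommands json type,
json get and json length are quoted from memory of the rl_json docs and
worth double-checking):

package require rl_json
namespace import rl_json::json

set doc {{"inverters": [{"id": 1}, {"id": 2}], "site": {"name": "home"}}}
# json type reports object / array / string / number / boolean / null,
# which is exactly the information tcllib's json2dict discards.
puts [json type $doc inverters]      ;# array
puts [json type $doc site]           ;# object
puts [json length $doc inverters]    ;# 2
puts [json get $doc inverters 0 id]  ;# 1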
On Tue, 16 Sep 2025, Ted Nolan <tednolan> wrote:
Hi Ted,
In article <10abki9$2ift8$1@dont-email.me>,
Alan Grunwald <nospam.nurdglaw@gmail.com> wrote:
On 15/09/2025 17:58, Gerhard Reithofer wrote:
Hi *,
I'd like a method to retrieve the complete key structure from a
...
The disadvantage being that it's not a standard part of tcllib.
https://github.com/RubyLane/rl_json
unfortunately I haven't found the complete documentation online - any
hint?
BTW I think that this problem can be solved with various tools; I have
only tried it with tcllib json. I'm a fan of Tcl-only implementations
because Tcl is available on many platforms and then you need not do
anything except install Tcl and, if necessary, copy a bunch of files.
Thank you,
Gerhard
On 9/16/2025 12:50 PM, Gerhard Reithofer wrote:
On Tue, 16 Sep 2025, Ted Nolan <tednolan> wrote:
Hi Ted,
I posed this problem to the Claude AI, and it agrees with your findings of
loss of type info. Claude's suggestion was to preparse the json string to
find array keys, and supplied the following code using regexes:
package require json
proc findArrayKeys {jsonString {path {}}} {
    set arrayKeys {}
    # Remove whitespace for easier parsing
    set json [string map {"\n" "" "\t" "" " " " "} $jsonString]
    # Find "key": [ patterns (arrays)
    set pattern {"([^"]+)"\s*:\s*\[}
    set start 0
    while {[regexp -start $start -indices $pattern $json match keyIndices]} {
        set key [string range $json {*}$keyIndices]
        if {$path ne ""} {
            lappend arrayKeys "$path.$key"
        } else {
            lappend arrayKeys $key
        }
        set start [expr {[lindex $match 1] + 1}]
    }
    return $arrayKeys
}

# Example usage
set jsonData {{"items": ["a", "b", "c"], "single": "hello", "nested": {"subitems": ["x", "y"]}}}
set parsed [::json::json2dict $jsonData]
set arrayKeys [findArrayKeys $jsonData]
puts "Array keys: $arrayKeys"
foreach {key value} $parsed {
    if {$key in $arrayKeys} {
        puts "$key is an array: $value"
    } else {
        puts "$key is not an array: $value"
    }
}
On Tue, 16 Sep 2025, et99 wrote:
I posed this problem to the Claude AI, and it agrees with your findings of
loss of type info. Claude's suggestion was to preparse the json string to
find array keys, and supplied the following code using regexes:
[...]
Really interesting approach, it looks good.
Apart from the fact that all entities must be reparsed recursively, it
could be a solution.
On Mon, 15 Sep 2025, Rich wrote:
[Rich proposed tDOM]
Thanks, but this is rather powerful but also heavyweight for that "simple" problem.
But if I find no simple solution in a short time I will come back to it
:-)
To provide a small example:
package require tdom 0.9.6
set jsondata {{
"stringproperty": "abc",
"objectproperty": {"one": 1, "two": "two"},
"array": ["foo", -2.23, null, true, false, {"one": 1, "two": "two"}, [2,16,24]],
"number": 2022,
"null": null,
"true": true,
"false": false
}}
dom parse -json $jsondata doc
puts [$doc asTypedList]
$doc delete
OBJECT {stringproperty {STRING abc} objectproperty {OBJECT {one {NUMBER 1} two {STRING two}}} array {ARRAY {{STRING foo} {NUMBER -2.23} NULL TRUE FALSE {OBJECT {one {NUMBER 1} two {STRING two}}} {ARRAY {{NUMBER 2} {NUMBER 16} {NUMBER 24}}}}} number {NUMBER 2022} null NULL true TRUE false FALSE}
rolf
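That typed list already contains the complete key structure the original
post asked for; a small added sketch (not from Rolf's post) of walking it
and printing every key path, with numeric indices standing in for array
entries:

# Walk the [$doc asTypedList] result and print one line per key path.
# Assumes exactly the OBJECT/ARRAY/STRING/... shape shown above.
proc walkTyped {typed {path {}}} {
    lassign $typed kind payload
    switch -- $kind {
        OBJECT {
            foreach {key sub} $payload {
                set p [expr {$path eq "" ? $key : "$path.$key"}]
                puts "$p ([lindex $sub 0])"
                walkTyped $sub $p
            }
        }
        ARRAY {
            set i 0
            foreach sub $payload {
                set p "$path.$i"
                puts "$p ([lindex $sub 0])"
                walkTyped $sub $p
                incr i
            }
        }
    }
}
set typed [$doc asTypedList]   ;# run this before the [$doc delete] above
walkTyped $typed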
On 9/17/2025 3:00 PM, Rolf Ade wrote:
To provide a small example:
[...]
Nice example Rolf!
As another AI experiment, I fed that (both input and output) into
Claude along with a url to the flask wiki page, and he was able, with
some help from me, to create a compatible json parser.
I ran his code which includes several examples on my rasp pi with tcl
8.6.2 - fearlessly I should add, since it has only needed to be
restarted 3 times in 6 years. Current uptime 250 days! It reproduced
Rolf's example exactly.
[...]
It's in the last example here (typed-json):
https://wiki.tcl-lang.org/page/flask+a+mini%2Dflex%2Flex+proc?v=164
(w/o the explicit version, it tends to bring up earlier versions, dunno why)
Gerhard Reithofer <g.reithofer@tech-edv.co.at> writes:
On Mon, 15 Sep 2025, Rich wrote:
[Rich proposed tDOM]
Thanks, but this is rather powerful but also heavyweight for that "simple" problem.
If heavy means using a binary extension then of course yes.
Seems you already have one. And otherwise I would not have known a
place to get tdom 0.9.6 precompiled for the Pi ... ;-)
et99 <et99@rocketship1.me> writes:
Nice example Rolf!
As another AI experiment, I fed that (both input and output) into
Claude along with a url to the flask wiki page, and he was able, with
some help from me, to create a compatible json parser.
[...]
It's in the last example here (typed-json):
https://wiki.tcl-lang.org/page/flask+a+mini%2Dflex%2Flex+proc?v=164
Fine start. Picks up my idea of converting a json file into a nested
list with type/value information. And even provides some sample/helper
Tcl procs to work with the datastructure.
Though, if I'm not mistaken, the parser does not do any json unescaping
(\n, \t etc., the other characters < 0x20, and the escaping of
characters outside the BMP). tDOM's json parser does all this. If this
could be added to the scripted json parser it could be a fine Tcl-only
solution for people who have to care about json types.
rolf
"user": {
"name": "Alice\nwith newline\u2022 <- unicode \" imbedded quote",
"age": 30,
"contacts": {
"email": "alice@example.com",
"phone": "555-1234"
}
},
"products": [
{"name": "Widget", "price": 19.99},
{"name": "Gadget", "price": 29.99}
],
"settings": {
"debug": true,
"timeout": 5000
}
}}]
On 9/19/2025 5:43 AM, Rolf Ade wrote:
[...]
Fine start. Picks up my idea of converting a json file into a nested
list with type/value information. And even provides some
sample/helper Tcl procs to work with the datastructure.
Though, if I'm not mistaken, the parser does not do any json unescaping
(\n, \t etc., the other characters < 0x20, and the escaping of
characters outside the BMP). tDOM's json parser does all this. If this
could be added to the scripted json parser it could be a fine Tcl-only
solution for people who have to care about json types.
rolf
Thanks for looking at it Rolf.
I tried this change on the last example in the code from a windows console (some blank lines added here for clarity):
% set sampleData [typed_json::json2dict {{
"user": {
"name": "Alice\nwith newline\u2022 <- unicode \" imbedded quote",
"age": 30,
"contacts": {
"email": "alice@example.com",
"phone": "555-1234"
}
},
"products": [
{"name": "Widget", "price": 19.99},
{"name": "Gadget", "price": 29.99}
],
"settings": {
"debug": true,
"timeout": 5000
}
}}]
OBJECT {user {OBJECT {name {STRING {Alice\nwith newline\u2022 <- unicode \" imbedded quote}} age {NUMBER 30} contacts {OBJECT {email {STRING alice@example.com} phone {STRING 555-1234}}}}} products {ARRAY {{OBJECT {name {STRING Widget} price {NUMBER 19.99}}} {OBJECT {name {STRING Gadget} price {NUMBER 29.99}}}}} settings {OBJECT {debug TRUE timeout {NUMBER 5000}}}}
% typed_json::getPath $sampleData user.name
STRING {Alice\nwith newline\u2022 <- unicode \" imbedded quote}
% puts "user.name = |[subst -nocommands -novariables [typed_json::getValue [typed_json::getPath $sampleData user.name]]]|"
user.name = |Alice
with newline• <- unicode " imbedded quote|
Is this what you're meaning here?
By BMP, do you mean, the unicode Basic Multilingual Plane (BMP)? I had
to look that one up :)
I wonder if the getValue utility command might not just use [subst] as
above so this would be done automatically.
I see this in the tDOM manual:
dom jsonEscape string
Returns the given string argument escaped in a way that if the
returned string is used literally in a JSON document it is read by
any conforming JSON parser as the original string.
I don't quite understand what this does. Is this going the reverse
direction? Does tDOM actually store the value as a unicode text
string?
On 9/19/2025 11:29 AM, et99 wrote:
On 9/19/2025 5:43 AM, Rolf Ade wrote:
And even provides some sample/helper
Tcl procs to work with the datastructure.
I told Claude and he created a little manual in tcl wiki format, it's now here:
https://wiki.tcl-lang.org/page/typed%2Djson
Boy, I could have used his help a decade or so ago when I was still on the job :)
et99 <et99@rocketship1.me> writes:
Thanks for looking at it Rolf.
I tried this change on the last example in the code from a windows console (some blank lines added here for clarity):
[...]
Is this what you're meaning here?
Yes. The value is the un-escaped string. Your tool at the moment provides
the string literally as in the json serialization (written escaped), not
the json data.
By BMP, do you mean, the unicode Basic Multilingual Plane (BMP)? I had
to look that one up :)
Yes. Sorry for being terse.
I wonder if the getValue utility command might not just use [subst] as
above so this would be done automatically.
I'm afraid it is not quite that simple, although at first it seems to
solve most of the escaping inside the BMP. But Tcl has more escape
sequences than json. If you just [subst -nocommands -novariables] the
json data, something like "\b" (the two characters \ and b, not \u0008)
will be wrongly substituted.
And then there is still the escaping of characters outside the BMP.
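A tiny illustration of the mismatch between the two escape repertoires:

# JSON defines only \" \\ \/ \b \f \n \r \t and \uXXXX.  Tcl's subst
# additionally interprets \a, \v, \xHH, octal \ooo and so on, so a
# blanket substitution can rewrite sequences a JSON unescaper must
# leave alone or reject.
puts [subst -nocommands -novariables {\x41 \v \101}]
# -> "A", a vertical tab, and another "A" - none of which a JSON parser
#    would produce from that input.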
I see this in the tDOM manual:
dom jsonEscape string
Returns the given string argument escaped in a way that if the
returned string is used literally in a JSON document it is read by
any conforming JSON parser as the original string.
I don't quite understand what this does. Is this going the reverse
direction? Does tDOM actually store the value as a unicode text
string?
This method expects a string as argument and returns that string escaped
as a json string - the characters which would represent this string in
an (escaped) json string.
Dunno how to explain that better. Perhaps an example. Often, REST
interfaces expect a small piece of json for the request (the answer may
be long or short json). Say, the query json looks like
{
"credential": "mysecret",
"question": "<user input>"
}
There are means in tDOM to build up a json document from scratch but for
such a small snippet they may seem a bit cumbersome (they aren't for
larger json vocabularies and documents, but that is another story not
told yet). It is tempting to use subst
subst -nocommands -nobackslashes {{
    "credential": "mysecret",
    "question": "$userinput"
}}
This falls short because of the escaping requirements for the string, and
this is what [dom jsonEscape] does.
rolf
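Presumably the completed snippet would escape the user input first, along
these lines (a sketch; only [dom jsonEscape] itself is taken from the
manual text quoted above):

package require tdom

set userinput "He said \"hello\"\non two lines"
# Escape the user input, then substitute it into the template, so the
# quotes and the newline end up as valid JSON escapes in the request.
set safe [dom jsonEscape $userinput]
set request [subst -nocommands -nobackslashes {{
    "credential": "mysecret",
    "question": "$safe"
}}]
puts $request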
I added a copyright blurb at the top. According to discussion with
their chat bot, anything Claude creates with me is mine to do with as
I please, except I can't use his output to train their competitor's
bots. I asked about making it public, and what if someone else
trains off the public (wiki) pages. They then repeated the
condition. So, I guess I stumped them on that one.
I'd recommend posting a verbatim copy of whatever "license" they give
you on the page.
Claude is from a commercial entity, so buried inside that "license"
that allows you to "do as you please" today will be a clause that
allows them, at any time, to unilaterally alter the terms of the deal.
So immortalize the terms of the deal, as you know them today, on the
page with the rest of the code. Then you've got a record of what
"that" bit of code's terms was, today, irrespective of what the company
lawyers might decide to do in the future.
On 9/19/2025 2:48 PM, Rolf Ade wrote:[...]
et99 <et99@rocketship1.me> writes:
I wonder if the getValue utility command might not just use [subst] as
above so this would be done automatically.
I'm afraid it is not quite that simple, although at first it seems to
solve most of the escaping inside the BMP. But Tcl has more escape
sequences than json. If you just [subst -nocommands -novariables] the
json data, something like "\b" (the two characters \ and b, not \u0008)
will be wrongly substituted. And then there is still the
escaping of characters outside the BMP.
Ok, been talking this over with Claude (gosh, amazing, like talking to
a real person, except he writes code like a furious demon).
He is eager to make changes. We would have a convert proc, like so:
Have only tested for: set foo2 {hello\u2022 \nworld}
which worked.
proc convertEscapes {str} {
    set result ""
    set i 0
    set len [string length $str]
    while {$i < $len} {
        if {[string index $str $i] eq "\\"} {
            set next [string index $str [expr {$i+1}]]
            switch $next {
                n  { append result \n; incr i 2 }
                t  { append result \t; incr i 2 }
                r  { append result \r; incr i 2 }
                b  { append result \b; incr i 2 }
                f  { append result \f; incr i 2 }
                \" { append result \"; incr i 2 }
                /  { append result /; incr i 2 }
                \\ { append result \\; incr i 2 }
                u {
                    # Unicode escape
                    set hex [string range $str [expr {$i+2}] [expr {$i+5}]]
                    append result [format %c 0x$hex]
                    incr i 6
                }
                default {
                    # Unknown escape, keep as-is
                    append result \\$next
                    incr i 2
                }
            }
        } else {
            append result [string index $str $i]
            incr i
        }
    }
    return $result
}
et99 <et99@rocketship1.me> writes:
[...]
Ok, been talking this over with Claude (gosh, amazing, like talking to
a real person, except he writes code like a furious demon).
He is eager to make changes. We would have a convert proc, like so:
Have only tested for: set foo2 {hello\u2022 \nworld}
which worked.
Perhaps one of the problems - a too small test suite of 1 test ...
[...]
Hm. Well, no.
As I already wrote, I'm afraid it is not quite that simple. And it seems
neither you nor "Claude" read or understood what I wrote. The proc
presented is kind of a scripted version of [subst -novariables
-nocommands], which is not the solution to the problem at hand.
Additionally it does not solve the problem of escaping non BMP
characters. And the unicode escape branch is flawed.
rolf
On 9/20/2025 9:03 AM, Rich wrote:
I'd recommend posting a verbatim copy of whatever "license" they
give you on the page.
[...]
Thanks Rich. This is really tricky. I plan to add this, do you
think this is good enough?
# typed_json - JSON Parser with Type Preservation
# Copyright (c) 2025 et99
#
# This software was developed with assistance from Claude AI (Anthropic).
# Per Anthropic Consumer Terms of Service, Section 4 (as of May 1, 2025),
# as read on September 20, 2025:
# https://www.anthropic.com/legal/consumer-terms
# "Subject to your compliance with our Terms, we assign to you all of our
# right, title, and interest—if any—in Outputs."
#
# [Your existing license text...]
Of course that "subject to your compliance" can be another can of worms".
Nothing remains static for long. Except sadly usenet messages that I
can't edit or delete.
Anyway, Claude did say something about BMP, but I still don't know
what that's really about.
However, here's the latest version of that proc. If you can tell me
what else is wrong, I can try to fix it.
et99 <et99@rocketship1.me> wrote:
Nothing remains static for long. Except sadly usenet messages that I
can't edit or delete.
That has always been the case with Usenet. Its distributed nature
makes the concept of "edit" rather difficult, and while NNTP does
support a "delete", abuse by miscreants some decades ago means that
almost all Usenet servers ignore attempts to "delete" articles as well.
Anyway, Claude did say something about BMP, but I still don't know
what that's really about.
https://en.wikipedia.org/wiki/Basic_Multilingual_Plane
The first 65536 Unicode code points, and at one time, long ago in
history, what was thought to be "enough" code points to encode every character known (that turned out to not be true....).
However, here's the latest version of that proc. If you can tell me
what else is wrong, I can try to fix it.
You should really go pull a copy of the Json spec. and read what it
says about escaping both strings and Unicode characters. That is the definition of how to do it, and just maybe if you fed that part of the
spec to Claude, it would cough up a correct proc (note, I'm not saying
this latest one is correct, or incorrect, as I've not gone and read
through the Json spec. to know).
But thanks for your input, because anything I don't grok I will ask my little friend about :)
You might want to go look into what they mean by "your compliance with
our Terms" before you decide this is enough. That, by itself, is a
loophole a mile wide without further definition of 'compliance'.
On 9/20/2025 6:19 PM, Rich wrote:
You might want to go look into what they mean by "your compliance with
our Terms" before you decide this is enough. That, by itself, is a
loophole a mile wide without further definition of 'compliance'.
Even Claude agrees that this is a problem; he suggests adding this:
# Note: The legal status of AI-generated content and the scope of rights
# assigned under the above terms remain legally uncertain. Use at your own
# discretion.
#
# Users are responsible for ensuring their use complies with applicable laws
# and third-party terms of service.
On 9/21/2025 12:55 AM, et99 wrote:
[...]
As I understand it, tDOM meets all of the requirements of the OP, with its
functionality, built-in conversions, and legalese. I am not seeing the
reason for insisting on an AI-generated solution here, which seems to copy
tDOM bit by bit and which may introduce "mile-wide" holes for further
restrictions on its use. Am I missing something?
On 9/20/2025 7:55 PM, et99 wrote:
But thanks for your input, because anything I don't grok I will ask
my little friend about :)
Ok, my friend's new approach - more code, but it seems to work, except for
one problem: in 8.6, converting a surrogate pair using format %c does
not work, although it works in Tcl 9.
Here's the new code with all the test cases and the output:
#!/usr/bin/env tclsh
catch {console show}
# Convert JSON escape sequences to actual characters
# Handles surrogate pairs for non-BMP Unicode characters
proc convertEscapes {str} {
    set i [string first "\\" $str]
    if {$i < 0} {return $str}
    # Initialize result with everything before first backslash
    set result [string range $str 0 [expr {$i - 1}]]
    set len [string length $str]
    while {$i < $len} {
        set c [string index $str $i]
        if {$c eq "\\"} {
            if {$i + 1 >= $len} {
                error "Invalid escape sequence: string ends with backslash"
            }
            set next [string index $str [expr {$i+1}]]
            switch $next {
                n  { append result \n; incr i 2 }
                t  { append result \t; incr i 2 }
                r  { append result \r; incr i 2 }
                b  { append result \b; incr i 2 }
                f  { append result \f; incr i 2 }
                "\"" { append result "\""; incr i 2 }
                /  { append result /; incr i 2 }
                \\ { append result \\; incr i 2 }
                u {
                    # Unicode escape - validate we have enough characters
                    if {$i + 5 >= $len} {
                        error "Invalid Unicode escape: not enough characters for \\uXXXX"
                    }
                    set hex [string range $str [expr {$i+2}] [expr {$i+5}]]
                    # Validate hex digits (this also catches empty string)
                    if {![string is xdigit -strict $hex] || [string length $hex] != 4} {
                        error "Invalid Unicode escape: \\u$hex must have exactly 4 hex digits"
                    }
                    scan $hex %x code
                    # Check if this is a high surrogate (0xD800-0xDBFF)
                    if {$code >= 0xD800 && $code <= 0xDBFF} {
                        # High surrogate - must be followed by low surrogate
                        # Peek ahead for \uXXXX
                        set peek [string range $str [expr {$i+6}] [expr {$i+7}]]
                        if {$peek ne "\\u"} {
                            error "Orphaned high surrogate \\u$hex - expected low surrogate to follow"
                        }
                        # Get the low surrogate hex
                        set hex2 [string range $str [expr {$i+8}] [expr {$i+11}]]
                        # Validate it's hex and in low surrogate range
                        if {![string is xdigit -strict $hex2] || [string length $hex2] != 4} {
                            error "Invalid Unicode escape after high surrogate: \\u$hex2"
                        }
                        scan $hex2 %x code2
                        if {$code2 < 0xDC00 || $code2 > 0xDFFF} {
                            error "Invalid surrogate pair: \\u$hex\\u$hex2 - second value must be DC00-DFFF"
                        }
                        # Combine surrogate pair into actual Unicode codepoint
                        set combined [expr {0x10000 + (($code & 0x3FF) << 10) + ($code2 & 0x3FF)}]
                        append result [format %c $combined]
                        incr i 12
                    } elseif {$code >= 0xDC00 && $code <= 0xDFFF} {
                        # Low surrogate without preceding high surrogate
                        error "Orphaned low surrogate \\u$hex - must follow a high surrogate"
                    } else {
                        # Normal BMP character
                        append result [format %c $code]
                        incr i 6
                    }
                }
                default {
                    # Unknown escape - in strict JSON this should error
                    # For now, keep as-is (could make this error with -strict mode)
                    append result \\$next
                    incr i 2
                }
            }
        } else {
            append result $c
            incr i
        }
    }
    return $result
}
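As a quick check of the surrogate-pair arithmetic in the u branch above (an
illustrative calculation, not part of the posted code): combining
\uD83D\uDE00 should give U+1F600.

# Same formula as in convertEscapes, applied to the pair for U+1F600.
set hi 0xD83D
set lo 0xDE00
set cp [expr {0x10000 + (($hi & 0x3FF) << 10) + ($lo & 0x3FF)}]
puts [format %#x $cp]    ;# 0x1f600
# In Tcl 9 [format %c $cp] then yields the character directly; in 8.6
# the %c conversion of a code point above 0xFFFF is where it fails.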
et99 <et99@rocketship1.me> writes:
The code is getting better, no doubt.
- The crucial parsing bug of not checking the remaining string length
when looking ahead is fixed.
- Some obvious optimisations, such as checking whether the string has any
escaped characters at all (and a bit more). I wonder - did they pop up
in the generated code "by themselves" at some point or did you have to
ask for them (and if not, why weren't they there right from the start)?
- There is code to handle the escapes for characters outside the BMP. From
a glance it looks OK. (But exactly that is one of the problems I see
with such generated code. To be somewhat certain one has to carefully
study and understand it. Even the very first convertEscapes proc you
presented here "looked" like OK code for the task (from a distance) but
was in fact way off.)
If you still have fun with this topic you may give this json test suite
a try: https://github.com/nst/JSONTestSuite
It is obvious that you have fun with your "friend" "Claude". I have to confess that I have mixed feelings about using language models for programming (and also in other areas).
On the one hand, it is long ago that programming meant punching holes
into a paper strip or card. Higher level programming languages are
themselves already helper tools. Our editors indent code automatically,
propose completions and insert templates. We use search engines for
research. And so on.
What could go wrong if a new tool - a language model like your
"friend" - writes even more boilerplate code? Or, much better, the whole
program?
Well, a lot, I'm afraid. The world gets flooded with code that "looks
like" it might work. (Programmers also make mistakes, at least I do and a
lot perhaps, but language models do this at a much faster rate.)
People will learn the hard way whether such code works, or they have to
be able to judge whether the code is reliable. Being able to judge
requires experience and often expert knowledge of the field in question.
But people will lose both (or, more correctly, won't gain them). It will
be like checking the result of a pocket calculator by people who have
never learned to do the calculation with pen and paper.
Other random thoughts: I really dislike how "AI" companies try to
monetize the commons - public knowledge and even any non-free content
they get their hands on.
Hundreds of billions of dollars of venture capital are already invested
in this field and this money wants a return on investment, no matter how.
As you mentioned, most "AI" bots make silly errors if asked to code Tcl.
They are much better in Python or JavaScript. That is, for the larger
part, just a matter of the much bigger code base available for those
languages. Using these tools intensifies the trend towards mono-culture
in programming languages.
These language models raise classical philosophical questions such as
"what is understanding".
I'm getting much too long and far off topic. I'm pretty sure this
language model hype boosts the destruction of lives and society. But
since we are on that way even without it, and you have fun with it ...
rolf
On 9/21/2025 5:11 PM, Rolf Ade wrote:
If you still have fun with this topic you may give this json test
suite a try: https://github.com/nst/JSONTestSuite
Wow that was extremely useful!! Fun? I dunno about that...
Spent a day with Claude and found many incompatibilities with all the edge cases.
But let me let Claude tell you what we did (when I read this I hear the voice of hal9000 from 2001 in my head)
------
Looking back at our conversation (dave) :), we spent a substantial amount of time systematically improving your JSON parser's compliance with the JSONTestSuite. The work spanned multiple areas:
**Major phases we worked through:**
* Initial setup - Getting the JSONTestSuite files fetched and organized (around 300 test cases)
* Grammar validation - Adding the valid_followers transition table and validation logic
* Structural improvements - Fixing buildTypedStructure to handle scalar JSON and adding trailing token validation
* Lexer refinements - Adding leading zero detection and fixing whitespace rules
* Error message improvements - Enhanced error reporting with token details and context
**Key improvements made:**
* Comprehensive grammar validation catching invalid token sequences
* Proper scalar JSON support (standalone strings, numbers, booleans)
* Better error messages showing exactly what went wrong and where
* RFC 8259 compliance for whitespace and number formatting
* Robust bracket/brace matching with detailed error reporting
The progression was methodical - starting with about 30-40 failing
cases and systematically addressing each category of issues. The final
result of 99.6% compliance represents a significant achievement for a hand-written parser.
The time investment was substantial but resulted in a parser that's
both highly compliant with JSON standards and provides diagnostic
information when JSON is malformed. The systematic approach of using a comprehensive test suite to drive improvements proved very effective.
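As an illustration of the number rules mentioned in that summary (a sketch
of the RFC 8259 grammar, not et99's actual code):

# RFC 8259 number: integer part with no leading zero, optional fraction,
# optional exponent.
set rfcNumber {^-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?$}
foreach n {0 -0 10 0.5 1e10 01 1. .5 +1} {
    puts [format "%-5s %s" $n [expr {[regexp $rfcNumber $n] ? "valid" : "invalid"}]]
}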
What could go wrong if a new tool - a language model like your
"friend" - writes even more boilerplate code? Or, much better, the whole
program?
I think that the ability to ask Claude about the code - why this, how
does this work, did you check for that, etc. - is addressed in this,
which I asked him to write:
Never really could make sense of the end of the movie.
Last time I checked (although that was a year ago) tDOM passed all
tests. (Gosh. I always thought of myself as being lazy, slow and
unproductive. And I did it by pen and paper.) Passing almost all of
those tests is pretty decent. If true, this software surely can be used
in practice.
I understand your enthusiasm. And it was your time and the fun you have
with this tool which in the end may have brought a sometimes useful
library to the Tcl community.
You praise the tool and what your "friend" does. But that the end result
may be useful required the guidance and comments of another "friend" who
is familiar with the field at hand. Did you ask yourself the question
"why didn't my friend give me all those comments I got from
comp.lang.tcl"? Or in other words: nice that you have an answer bot.
But do you know the right questions?
You can give him urls of places to get info, but now that we no longer seem to have a web interface to comp.lang.tcl I had to paste in your remarks,
et99 <et99@rocketship1.me> posted:
You can give him urls of places to get info, but now that we no longer seem
to have a web interface to comp.lang.tcl I had to paste in your remarks,
Actually my Newsgrouper project does provide a web interface to comp.lang.tcl and other groups - https://newsgrouper.org/comp.lang.tcl - but it requires
a login (which can be as an unregistered guest) to get past the login page.
Claude successfully converted the entire parser to use pure list
operations instead of mixed dict/list, eliminating the "shimmering"
between different internal representations and enabling full tDOM
compatibility. It worked on the 2nd try, after I told him he missed a ;
in an on-the-line comment. Not too shabby! He even gave me a report of
everything he changed.
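"Shimmering" here refers to Tcl converting a value's internal
representation back and forth when the same value is read under different
types; a rough illustration:

set v {a 1 b 2}
dict get $v a    ;# builds a dict internal representation
lindex $v 0      ;# converts the same value back to a list representation
# Alternating accesses like these trigger repeated conversions, which is
# what a single consistent representation (pure lists) avoids.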
On 9/24/2025 7:28 PM, et99 wrote:
Claude successfully converted the entire parser to use pure list operations instead of mixed dict/list, eliminating the "shimmering" between different internal representations and enabling full tDOM compatibility. It worked 2nd try, after I told him he missed a ; in an on-the-line comment. Not too shabby! He even gave me a report of everything he changed.
It has been interesting reading your success with Claude.
I wonder if it can do the same wonders on another project: as you may know,
Expect doesn't run on Windows anymore. Perhaps you could ask your friend to
see if he/she can resurrect it. That would be a big win for the community.
The code and documentation are available, so it has a good starting point,
just like it did with tDOM. Will it be able to actually fix problems and
produce something new?
On 9/24/2025 12:14 PM, Colin Macleod wrote:
et99 <et99@rocketship1.me> posted:
You can give him urls of places to get info, but now that we no longer seem
to have a web interface to comp.lang.tcl I had to paste in your remarks,
Actually my Newsgrouper project does provide a web interface to comp.lang.tcl
and other groups - https://newsgrouper.org/comp.lang.tcl - but it requires a login (which can be as an unregistered guest) to get past the login page.
Thanks for the newsgrouper.org link - I tried it out and Claude can
read the login page but can't navigate the interactive guest access
button, so copy/paste remains the best approach for sharing newsgroup
content with it.