2 views

Skip to first unread message

Jun 3, 2003, 4:53:16 PM6/3/03

to

Hi all...

i want to build a routine that generates unique character sequences in the

form AAAA where A is any letter except I an O (which look like 1 and 0

possibly)

its a long time since i had to do any maths but i think this would give me

24*24*24*24 possible unique combinations which is 331776

i am in the early stages of doing this but am imagining a set of four

cylinders just like a speedo mile counter but with 24 letters on each

cylinder...i start with each cylinder having a offset value (i don't want

them all to start at "A") and an increment value which is zero and will be

updated (to 24) when a letter is used.

i have loop which could be 1 to 331776 (in reality the offset and increment

values will be held in a table and the code will be called as a function to

generate one code at a time.

i also have an array arrLetters(24) which is loaded (1-24) with the letters

i am going to use.

so....

offset1=3:increment1=0

offset2=5:increment2=0

offset3=2:increment3=0

offset4=6:increment4=0

loop at 1

result="CEBF"

increment4=increment4+1

loop at 2

result="CEBG"

.

.

.

and so on

thats the easy bit....when i try and figure out how to make sure i cover

every possible "throw of the dice" it will need to increment each of the

cylinders in turn and then go back and combine different increment settings

on different cylinders etc etc...

i am sure someone out there can visualise this problem far better than i

can...i think what i have so far is sound and i can see how to loop through

to get the four sets of cylinders to go from 1 to 24..its covering all the

possible settings that i am struggling with...any ideas on how to proceed

would be appreciated

thanks

gdp

Jun 3, 2003, 5:19:40 PM6/3/03

to

Four cylinders are weak. I'm partial to the 77-79 Olds 403 V8, personally.

But it's gotta have 72 7a heads.

But it's gotta have 72 7a heads.

So, you just want a list that goes from AAAA to ZZZZ with all combinations

in between? Is that what you're saying?

You won't have 24^4 combinations though. You'll have 24^4-24^3, because

you're skipping A, B, C, ...AA, AB, AC..., AAA, AAB, AAC, .... It's kinda

like if you were to just keep yourself in base 10 and think about the number

of different combinations you'd have by restricting yourself to four digits.

At first it would seem that you'd have 10,000 (10^4). 0 up to 9,999. But,

you'd (in this example) be starting at 1000. So, you miss 0-999, which is

1000 numbers (or 10^3). But with the base 24, you'll still be left with

317,952 combinations...

You could either use an array of letters or just skip 9 and 15. If this is

the first reply to this, I suggest waiting for others, because there's

probably someone who will rip apart my inefficiency here. [:

for a = 1 to 26

if a = 9 or a = 15 then a = a + 1

for b = 1 to 26

if b = 9 or a = 15 then b = b + 1

for c = 1 to 26

if c = 9 or a = 15 then c = c + 1

for d = 1 to 26

if d = 9 or a = 15 then d = d + 1

response.write chr(a+64) & chr(b+64) & chr(c+64) & chr(d+64)

& "<br>"

next

next

next

next

Ray at work

"gdp" <gp014...@blueyonder.co.uk> wrote in message

news:1V7Da.691$RP6...@news-binary.blueyonder.co.uk...

Jun 3, 2003, 5:29:59 PM6/3/03

to

My 4 cylinder motorcycle would blow the doors off of your Olds.

"Ray at <%=sLocation%>" <r...@work.spam> wrote in message

news:%23UtOZXh...@TK2MSFTNGP12.phx.gbl...

Jun 3, 2003, 5:40:36 PM6/3/03

to

Well, uh, I guess I can't deny that. But I could have 5 18 year old girls

with me. :P

with me. :P

Ray at work

"Tom B" <shu...@hotmail.com> wrote in message

news:uXKyQdhK...@TK2MSFTNGP12.phx.gbl...

Jun 3, 2003, 5:46:55 PM6/3/03

to

excellent..it seems to do what i want...i ran your code Ray with a counter

added thus...

added thus...

for a = 1 to 26

if a = 9 or a = 15 then a = a + 1

for b = 1 to 26

if b = 9 or a = 15 then b = b + 1

for c = 1 to 26

if c = 9 or a = 15 then c = c + 1

for d = 1 to 26

if d = 9 or a = 15 then d = d + 1

c1=c1+1

response.write c1 & ": " & chr(a+64) & chr(b+64) & chr(c+64)

& chr(d+64) & "<br>"

next

next

next

next

i now have lots of letter sequences...however that last c1 written is

375000...this seems to be more than 24^4 or even 24^4-24^3?

thanks for the solution....much appreciated

gdp

"gdp" <gp014...@blueyonder.co.uk> wrote in message

news:1V7Da.691$RP6...@news-binary.blueyonder.co.uk...

Jun 3, 2003, 6:03:48 PM6/3/03

to

spot the typos!!!

check b=9 or a=15

should read

b=9 or b=15

and so on

Tim

"gdp" <gp014...@blueyonder.co.uk> wrote in message

news:kH8Da.9$Zh...@news-binary.blueyonder.co.uk...

Jun 4, 2003, 8:22:58 AM6/4/03

to

Alright, you win.

"Ray at <%=sLocation%>" <r...@work.spam> wrote in message

news:uRUxFjhK...@TK2MSFTNGP09.phx.gbl...

Jun 4, 2003, 12:34:38 PM6/4/03

to

"Ray at <%=sLocation%>" wrote:

>

> You won't have 24^4 combinations though. You'll have 24^4-24^3,

> because you're skipping A, B, C, ...AA, AB, AC..., AAA, AAB, AAC,

> ....

>

> You won't have 24^4 combinations though. You'll have 24^4-24^3,

> because you're skipping A, B, C, ...AA, AB, AC..., AAA, AAB, AAC,

> ....

That's not correct, Ray. There are 24^4 combinations. Read on:

> It's kinda like if you were to just keep yourself in base 10 and

> think about the number of different combinations you'd have by

> restricting yourself to four digits. At first it would seem that

> you'd have 10,000 (10^4). 0 up to 9,999. But, you'd (in this

> example) be starting at 1000.

Assuming you don't consider numbers like 0063 to be 4-digit numbers. We lop off

the leading zeros as a matter of convention (and for readability), but there's

no real need to do so. Certainly no such rule applies to strings. Which letter

would you designate as the "placeholder" digit?

> So, you miss 0-999, which is 1000 numbers (or 10^3). But with the

> base 24, you'll still be left with 317,952 combinations...

...which you can now see is not true.

--

Dave Anderson

Unsolicited commercial email will be read at a cost of $500 per message. Use of

this email address implies consent to these terms. Please do not contact me

directly or ask me to contact you directly for assistance. If your question is

worth asking, it's worth posting.

Jun 4, 2003, 1:02:53 PM6/4/03

to

Aside from the need to (correctly - as pointed out by Tim) evaluate and discard

unwanted character values, your loop is hopelessly inefficient. Here's a more

efficient method, and one that is extensible (more on that in a moment):

unwanted character values, your loop is hopelessly inefficient. Here's a more

efficient method, and one that is extensible (more on that in a moment):

function showCombinations() {

var a = "ABCDEFGHJKLMNPQRSTUVWXYZ".split(""),

counter = i = j = k = l = 0,

str = ""

for (i=0; i<a.length; i++)

for (j=0; j<a.length; j++)

for (k=0; k<a.length; k++)

for (l=0; l<a.length; l++) {

counter++

str += a[i] + a[j] + a[k] + a[l] + "<BR>"

}

return str + counter

}

You can test this on smaller sets by changing the initial string to something

like "ABC" (which yeilds 81 results, or 3^4, Ray). On top of that, you can add

as many different characters as you want, so you are no longer bound to 24^4

combinations. The sky is the limit.

"gdp" wrote:

>

> for a = 1 to 26

> if a = 9 or a = 15 then a = a + 1

> for b = 1 to 26

> if b = 9 or a = 15 then b = b + 1

> for c = 1 to 26

> if c = 9 or a = 15 then c = c + 1

> for d = 1 to 26

> if d = 9 or a = 15 then d = d + 1

> c1=c1+1

> response.write c1 & ": " & chr(a+64) & chr(b+64) & chr(c+64)

> & chr(d+64) & "<br>"

> next

> next

> next

> next

>

>

--

Jun 4, 2003, 1:32:43 PM6/4/03

to

You're correct. And now I'm trying to figure out how/why I wind up with

375,000 combinations.

375,000 combinations.

Ray at work

"Dave Anderson" <GTSPXO...@spammotel.com> wrote in message

news:eWYfmsrK...@TK2MSFTNGP12.phx.gbl...

Jun 4, 2003, 2:28:08 PM6/4/03

to

Using your code I found that number 79546 was offensive.

"Ray at <%=sLocation%>" <r...@work.spam> wrote in message

news:%23UtOZXh...@TK2MSFTNGP12.phx.gbl...

Jun 4, 2003, 2:38:08 PM6/4/03

to

79546? I see that as FCGV. Hmmm, maybe it's some other four letter word

that starts with an F. What could that be, I wonder...

that starts with an F. What could that be, I wonder...

Ray at work

"Tom B" <shu...@hotmail.com> wrote in message

news:eJRWQcsK...@TK2MSFTNGP11.phx.gbl...

Jun 4, 2003, 3:26:19 PM6/4/03

to

"Dave Anderson" <GTSPXO...@spammotel.com> wrote in message

news:uUk2psrK...@TK2MSFTNGP12.phx.gbl...> Aside from the need to (correctly - as pointed out by Tim) evaluate and

discard

> unwanted character values, your loop is hopelessly inefficient. Here's a

more

> efficient method, and one that is extensible (more on that in a moment):

>

> function showCombinations() {

> var a = "ABCDEFGHJKLMNPQRSTUVWXYZ".split(""),

> counter = i = j = k = l = 0,

> str = ""

> for (i=0; i<a.length; i++)

> for (j=0; j<a.length; j++)

> for (k=0; k<a.length; k++)

> for (l=0; l<a.length; l++) {

> counter++

> str += a[i] + a[j] + a[k] + a[l] + "<BR>"

> }

> return str + counter

> }

If you're after efficiency avoid/minimize the following:

1. Method calls (the call to split)

2. References to properties (a.length)

3. Array lookups (a[i])

4. String concatenation (str += a[i] + a[j] + a[k] + a[l] + "<BR>")

Here are my VBScript/JScript versions:

<script language="VBScript" runat="SERVER">

Dim intStart,arr,i,j,k,l,a,b,c

arr =

Array("A","B","C","D","E","F","G","H","J","K","L","M","N","P","Q","R","S","T

","U","V","W","X","Y","Z")

intStart = Timer

Response.Write "<pre>"

For i = 0 to 23

a=arr(i)

For j = 0 to 23

b=arr(j)

For k = 0 to 23

c=arr(k)

For l = 0 to 23

Response.Write arr(l)

Response.Write a

Response.Write b

Response.Write c

Response.Write " "

Next

Response.Write vbCRLF

Next

Next

Next

Response.Write "</pre>"

Response.Write "VBScript: " & (Timer - IntStart) & " seconds"

</script>

<script language="JavaScript" runat="SERVER">

var intStart;

var arr =

["A","B","C","D","E","F","G","H","J","K","L","M","N","P","Q","R","S","T","U"

,"V","W","X","Y","Z"];

var i,j,k,l,a,b,c;

intStart = new Date().getTime();

Response.Write("<pre>");

for (i=0;i<23;i++)

{

a=arr[i];

for (j=0;j<23;j++)

{

b=arr[j];

for (k=0;k<23;k++)

{

c=arr[k];

for (l=0;l<23;l++)

{

Response.Write(arr[l]);

Response.Write(a);

Response.Write(b);

Response.Write(c);

Response.Write(" ");

}

Response.Write("\n");

}

}

}

Response.Write("</pre>");

Response.Write("Javascript: " + (new Date().getTime() - intStart)/1000 + "

seconds");

</script>

Note: On my system, the VBScript ran ~6-7 times as fast as the JScript. I

find that troublesome.

HTH

-Chris

Jun 4, 2003, 6:23:46 PM6/4/03

to

"Chris Hohmann" wrote:

>

> If you're after efficiency avoid/minimize the following:

> 1. Method calls (the call to split)

>

> If you're after efficiency avoid/minimize the following:

> 1. Method calls (the call to split)

One call to split is not going to affect performance in a function with roughly

2 million concatenations.

> 2. References to properties (a.length)

> 3. Array lookups (a[i])

> 4. String concatenation (str += a[i] + a[j] + a[k] + a[l] + "<BR>")

These might apply to VBScript, but what about JScript? I took a look, and found

one big surprise. Read on:

I used my original JScript example and got rid of all concatenation, then ran it

10 times to get an average time. For the rest of this discussion, I'll use that

as the baseline, normalized to 100. See bottom of post for exact implementation

of baseline function.

For reference, my original averaged 1581, so mass concatenation clearly should

be frowned upon in JScript, just as it is in VBScript.

I next minimized array lookups, and the average dropped to 90.

After dropping length references, the average was 89. The function now is

equivalent to the one you posted.

I wondered if there was any overhead to Response.Write, and modified the code

slightly, using only one Response.Write per iteration of l, changing this:

Response.Write(x)

Response.Write(y)

Response.Write(z)

Response.Write(a[l])

Response.Write("<BR>")

to this:

Response.Write(x + y + z + a[l] + "<BR>")

The average dropped to 54. Quite a surprise. I wondered if I could gain more

performance by moving out a loop, using Response.Write only when k iterates:

for (k=0; k<len; k++) {

z = a[k]

for (l=0; l<len; l++) {

Response.Write(x + y + z + a[l] + "<BR>")

}

}

becomes:

for (k=0; k<len; k++) {

z = a[k]

for (l=0; l<len; l++) {

str += x + y + z + a[l] + "<BR>"

}

Response.Write(str)

str = ""

}

Average score: 37. It would appear as though *some* concatenation is preferable

to none. How far can it go? Let's move out another loop, to j:

for (j=0; j<len; j++) {

y = a[j]

for (k=0; k<len; k++) {

z = a[k]

for (l=0; l<len; l++) {

str += x + y + z + a[l] + "<BR>"

}

}

Response.Write(str)

str = ""

}

Average score goes up to 46. For completeness, I'll also state that one

Response.Write per iteration of i yeilds an average of 101. Clearly the slope

get steep fast, as one more move outward increases this 15-fold.

In conclusion, I think it's clear that (a) array lookups slightly affect

performance, (b) reading properties has a negligible affect, and (c) (at least

in JScript), there should be a balance between concatenation and

Response.Write(). All other tests I have seen posted here have focused on one

extreme or the other, but this one clearly shows there is a middle ground that

yeilds best performance.

> Note: On my system, the VBScript ran ~6-7 times as fast as the JScript. I

> find that troublesome.

I saw a similar results, but when I commented out the Response.Write statements

and just incremented a counter, the scores were virtually identical. That

suggests to me that JScript works a little harder to interface with the Response

Object than does VBScript.

Here's my baseline implementation, as promised:

function baseline() {

var a = "ABCDEFGHJK".split(""), i = j = k = l = 0

for (i=0; i<a.length; i++) {

for (j=0; j<a.length; j++) {

for (k=0; k<a.length; k++) {

for (l=0; l<a.length; l++) {

Response.Write(a[i])

Response.Write(a[j])

Response.Write(a[k])

Response.Write(a[l])

Response.Write("<BR>")

}

}

}

}

}

Comments are welcome.

Jun 6, 2003, 5:36:23 AM6/6/03

to

Yeah, Response.Write is a killer in JScript. It stems from the fact that

Response.Write is late bound in JScript and early bound in VBScript.

VBScript also has an optimization for Response.Write(constant). There's not

much that can be done about that except to minimize the calls to

Response.Write. Anyway, where possible, I moved concatenation outward.

Here's the code:

Response.Write is late bound in JScript and early bound in VBScript.

VBScript also has an optimization for Response.Write(constant). There's not

much that can be done about that except to minimize the calls to

Response.Write. Anyway, where possible, I moved concatenation outward.

Here's the code:

function fncJ1()

{

var intStart = new Date().getTime();

var arr =

["A","B","C","D","E","F","G","H","J","K","L","M","N","P","Q","R","S","T","U"

,"V","W","X","Y","Z"];

var i,j,k,l,a,b,c,d;

for (i=0;i<24;i++)

{

a="<br>"+arr[i];

for (j=0;j<24;j++)

{

b=a+arr[j];

for (k=0;k<24;k++)

{

c=b+arr[k];

d="";

for (l=0;l<24;l++)

{

d+=c+arr[l];

}

Response.Write(d);

}

}

}

return (new Date().getTime() - intStart)/1000;

}

Results:

Baseline : 100.00 : 15.092 sec

JScript0 : 32.78 : 4.947 sec

JScript1 : 16.92 : 2.553 sec

VBScript : 11.28 : 1.703 sec

Baseline is the baseline you included in your last post. JScript0 is your

optimized code. JScript1 are my further optimizations. VBScript is my

original VBScript solution. So when all is said and done, we've got the

JScript running at almost exactly 2/3 the speed of the VBScript. Or put

another way, the JScript take 1.5x as long as the VBScript to run. No bad,

not great. One thought that came to mind; there must be some way to

eliminate the duplicate array references, ie. each element is being access

24^4+24^3+24^2+24 = 346200 times! I took some stabs at hashing the character

set into sets of 6 and the iterating though possible combinations each

quartet but to no avail. I'd be intererted in your thought on this idea.

-Chris

Jun 6, 2003, 11:45:34 AM6/6/03

to

"Chris Hohmann" wrote:

>

> I took some stabs at hashing the character set into sets of 6

> and the iterating though possible combinations each quartet

> but to no avail. I'd be intererted in your thought on this idea.

>

> I took some stabs at hashing the character set into sets of 6

> and the iterating though possible combinations each quartet

> but to no avail. I'd be intererted in your thought on this idea.

Nice idea, but I believe you gain nothing because of the uniform distribution of

all pairs.

You've actually taken a stab at capitalizing on something called the "2nd-order

entropy" of the result set (see the first definition of information theoretic

entropy here: http://www.ciphersbyritter.com/GLOSSARY.HTM#Entropy). And since

the distribution was flat for both 1st and 2nd order, you probably saw a slight

decline in performance due to the increased algorithmic overhead. Am I correct?

Jun 7, 2003, 8:18:30 PM6/7/03

to

"Dave Anderson" <GTSPXO...@spammotel.com> wrote in message

news:OxRXuKEL...@TK2MSFTNGP10.phx.gbl...> "Chris Hohmann" wrote:

> >

> > I took some stabs at hashing the character set into sets of 6

> > and the iterating though possible combinations each quartet

> > but to no avail. I'd be intererted in your thought on this idea.

>

> Nice idea, but I believe you gain nothing because of the uniform

distribution of

> all pairs.

>

> You've actually taken a stab at capitalizing on something called the

"2nd-order

> entropy" of the result set (see the first definition of information

theoretic

> entropy here: http://www.ciphersbyritter.com/GLOSSARY.HTM#Entropy). And

since

> the distribution was flat for both 1st and 2nd order, you probably saw a

slight

> decline in performance due to the increased algorithmic overhead. Am I

correct?

You are indeed correct, I did see a slight decline in performance. Although

I'm not completely sold on the "2nd order entropy" argument. Primarily

because it's a little over my head. :) But secondly, my thinking is as

follows. If we can trade off array lookups for a more efficient operation,

then while the number of operations stays the same(or even goes up), the

total cost of execution could be brought down. I'll let you know what I come

up with.

-Chris

Jun 9, 2003, 1:38:57 PM6/9/03

to

"Chris Hohmann" wrote:

>

> You are indeed correct, I did see a slight decline in performance.

> Although I'm not completely sold on the "2nd order entropy" argument.

> Primarily because it's a little over my head. :)

>

> You are indeed correct, I did see a slight decline in performance.

> Although I'm not completely sold on the "2nd order entropy" argument.

> Primarily because it's a little over my head. :)

There's not much to it. You're attempting to solve a problem of much the same

type as data compression. You want to capitalize on redundancy. But you're only

able to really see performance gains when there's a great deal of redundancy

that can be abstracted.

In Information Theory, the level of redundancy is expressed as "Entropy", and

first-order entropy is basically a probability distribution based on frequency

of an individual character**.

Second-order entropy is a measure of the distribution of character pairs.

Because the number of elements is the square of those in the first-order case,

there is usually more overhead involved when performing second-order ops. And

usually that means worse performance, especially if the data set is small.

When you are processing a huge set of data with lots of redundancy (the text of

a book, for example), the extra overhead in second-order ops may be overcome by

the improved utilization of redundancy. But this was not such a case.

** More to the point: -SUM[p(i) * log2(p(i))^2]. This is minimized when the

probability distribution is flat -- when every character has equal probability.

And smaller numbers mean less redundancy, which means less opportunity to

capitalize on redundancy.

> But secondly, my thinking is as follows. If we can trade off array

> lookups for a more efficient operation, then while the number of

> operations stays the same (or even goes up), the total cost of

> execution could be brought down. I'll let you know what I come up

> with.

I'll be interested if you find a way to improve it. For now, I'm doubtful, if

only because of the flat distribution in this example.

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu