Thank you.
Finally no error when compiling.
Here are the timings, first for the code from my first post and then for the C# version:
Lines: 48574
Find words 0.294 sec.
kappaArr: 1866
omicronArr: 0
upsilonArr: 0
Find words 0.037 sec.
: 0
Strangely, the C# version returns 0 kappa matches, while there should be 1866.
The input is strToCheck, generated in the MAXScript; I pass it without any modifications.
dp = dotnetobject "DocProcessor"
dp.ProcessDocument strToCheck
kp = dp.kappa
format ": %\n" kp.count
If I change the C# code to check for the first word on the first line (for example, “phi”), the returned result is still 0.
Can we force the c# code to print directly in maxscript listener?
Update:
Even if I remove all the whitespace at the beginning of the lines, the result is still 0.
Did you try the link I posted above? It worked well and the lists weren’t empty.
Yes, you can print to the listener, but you’ll have to add Autodesk.Max.dll as a DLL reference (the DLL from the Max version in which you run the script)
and call
Autodesk.Max.GlobalInterface.Instance.TheListener.EditStream.Wputs & Autodesk.Max.GlobalInterface.Instance.TheListener.EditStream.Flush
or
Autodesk.Max.GlobalInterface.Instance.TheListener.EditStream.Printf
As I understand it, the task is to find all occurrences of a specified substring in a large text file (~50k lines) and get the line and position of each one, right?
I’m pretty sure finding all occurrences is very fast if you only collect positions. So I would first find the positions of all line endings and then the positions of the substring; after that it’s quick and easy to work out which line each match is on.
re = python.import "re"
(
t0 = timestamp()
h0 = heapfree
n = (re.finditer "\n" strToCheck)
k = (re.finditer "kappa" strToCheck)
o = (re.finditer "omicron" strToCheck)
u = (re.finditer "upsilon" strToCheck)
format "time:% heap:%\n" (timestamp() - t0) (h0 - heapfree)
)
--time:9 heap:1008L
Python is very well suited to string processing… so these are the numbers to aim for.
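The two-step idea described above (collect line-end positions first, then map each match position to a line) could be sketched in plain Python as follows. This is only an illustration: the function name `find_with_lines` and the sample string are made up for the example, and the binary search via `bisect` is one possible way to do the lookup, not necessarily what was intended.

```python
import re
from bisect import bisect_right

def find_with_lines(text, word):
    """Return (line_index, column) pairs for every occurrence of word in text.

    Hypothetical helper for illustration; line indices are zero-based.
    """
    # Offsets at which each line starts: 0, then one past every newline.
    line_starts = [0] + [m.end() for m in re.finditer("\n", text)]
    hits = []
    for m in re.finditer(re.escape(word), text):
        pos = m.start()
        # Index of the last line start that is <= pos (binary search).
        line = bisect_right(line_starts, pos) - 1
        hits.append((line, pos - line_starts[line]))
    return hits

sample = "alpha kappa\n\tkappa beta\nomicron kappa kappa\n"
print(find_with_lines(sample, "kappa"))
```

Because both passes only collect integer offsets, the per-match work is a single binary search over the line-start list.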
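OK, I managed to test the code in Max. The zero matches came from the regex patterns: the word-boundary escape needed another pair of backslashes. The C# source lives inside a MAXScript string literal, so every backslash is unescaped twice (once by MAXScript, once by the C# compiler) before the regex engine sees it, and with too few of them \b reaches Regex as a backspace character rather than a word boundary.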
var reOmicron = new Regex( \"^omicron\\\\b\", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | RegexOptions.Compiled );
var reKappa = new Regex( \"^kappa\\\\b\", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | RegexOptions.Compiled );
var reUpsilon = new Regex( \"^upsilon\\\\b\", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | RegexOptions.Compiled );
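As an analogy only (this is Python, not the C# from the thread), the same pitfall can be reproduced with Python's `re` module: in an ordinary string literal `"\b"` is a backspace character, so the pattern silently stops matching, while a raw string keeps the backslash for the regex engine. The sample line is made up for the example.

```python
import re

line = "kappa alpha beta"

# Without a raw string, Python turns "\b" into a backspace character (0x08),
# much as the C# compiler does with "\b" in a regular string literal.
broken = re.compile("^kappa\b", re.IGNORECASE)

# With the backslash preserved, the regex engine sees \b as a word boundary.
fixed = re.compile(r"^kappa\b", re.IGNORECASE)

print(broken.search(line) is None)       # the backspace never occurs in the text
print(fixed.search(line) is not None)    # "kappa" followed by a word boundary
print(fixed.search("kappas alpha") is None)  # no boundary inside "kappas"
```

The broken pattern compiles without any error, which is why the symptom is simply an empty result rather than an exception.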
code:
(
::dp_assembly = (
local source = "using System;
using System.Text;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class DocProcessor
{
List<int> _kappa = new List<int>();
List<string> _omicron = new List<string>();
List<string> _upsilon = new List<string>();
public int[] kappa {
get
{
return _kappa.ToArray();
}
}
public string[] omicron {
get
{
return _omicron.ToArray();
}
}
public string[] upsilon {
get
{
return _upsilon.ToArray();
}
}
public void ProcessDocument( string doc )
{
_kappa.Clear();
_omicron.Clear();
_upsilon.Clear();
var spaces = new char[]{' ','\t'};
var reOmicron = new Regex( \"^omicron\\\\b\", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | RegexOptions.Compiled );
var reKappa = new Regex( \"^kappa\\\\b\", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | RegexOptions.Compiled );
var reUpsilon = new Regex( \"^upsilon\\\\b\", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | RegexOptions.Compiled );
using (System.IO.StringReader sr = new System.IO.StringReader(doc))
{
int index = 0;
string line;
while ((line = sr.ReadLine()) != null)
{
var lline = line.TrimStart( spaces );
if ( reOmicron.IsMatch( lline ) )
{
_omicron.Add(line);
}
else
if ( reKappa.IsMatch( lline ) )
{
_kappa.Add( index );
}
else
if ( reUpsilon.IsMatch( lline ) )
{
_upsilon.Add( line );
}
index++;
}
}
}
}"
csharpProvider = dotnetobject "Microsoft.CSharp.CSharpCodeProvider"
compilerParams = dotnetobject "System.CodeDom.Compiler.CompilerParameters"
compilerParams.ReferencedAssemblies.Add("System.dll");
compilerParams.ReferencedAssemblies.Add("System.Windows.Forms.dll");
compilerParams.GenerateInMemory = on
compilerResults = csharpProvider.CompileAssemblyFromSource compilerParams #(source)
if (compilerResults.Errors.Count > 0 ) then
(
local errs = stringstream ""
for i = 0 to (compilerResults.Errors.Count-1) do
(
local err = compilerResults.Errors.Item[i]
format "Error:% Line:% Column:% %\n" err.ErrorNumber err.Line err.Column err.ErrorText to:errs
)
format "%\n" errs
undefined
)
else
(
compilerResults.CompiledAssembly
)
)
gc()
newLineStr = "\n"
newTabStr = "\t "
--( generate string
space = " "
nl = "\n"
tab = "\t"
tabNl = "\t\n"
fmt = "%\n"
wordsArr = #("alpha", "beta", "gama", "delta", "Epsilon", "Zeta", "Eta", "Theta", "Iota", "kaPPa", "LamBda", "mU", "Nu", "xi", "omicron", "pi", "rHo", "siGma", "Tau", "UpSiLoN", "pHi", "chi", "psi", "omega")
ss = stringStream ""
seed 12345
for i = 1 to 50000 do
(
wordsCnt = random 10 20
wArr = for j = 1 to wordsCnt collect (wordsArr[random 1 24])
tabCnt = random 0 5
str = ""
if mod i 17 == 0 then
str = tabNl
else
(
if mod i 33 == 0 then
str = nl
else
(
for t = 1 to tabCnt do str += tab
for w in wArr do str += space + w
)
)
format fmt str to:ss
)
--)
strToCheck = toLower (ss as string)
gc()
t0 = timestamp()
/*
ss = strToCheck as stringstream
seek ss 0
while not eof ss do
(
ln = trimLeft (readline ss) " "
if MatchPattern ln pattern:"kappa*" then count += 1 else
if MatchPattern ln pattern:"omicron*" then count += 1 else
if MatchPattern ln pattern:"upsilon*" do count += 1
)
*/
dp = (dotNetClass "System.Activator").CreateInstance (dp_assembly.GetType("DocProcessor"))
dp.ProcessDocument strToCheck
t1 = timestamp()
format "Find words % sec.\n" ((t1-t0)/1000.0)
format "kappaArr: %\n" dp.kappa.count
format "omicronArr: %\n" dp.omicron.count
format "upsilonArr: %\n" dp.upsilon.count
)
There might be a smarter way, but I use the first that comes to mind:
cmd = python.import "__builtin__"
cmd.list k as array
Serejah, Denis, Thank you.
Using this:
(
::dp_assembly = (
local source = "using System;
using System.Text;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class DocProcessor
{
List<int> _kappa = new List<int>();
List<string> _omicron = new List<string>();
List<string> _upsilon = new List<string>();
public int[] kappa {
get
{
return _kappa.ToArray();
}
}
public string[] omicron {
get
{
return _omicron.ToArray();
}
}
public string[] upsilon {
get
{
return _upsilon.ToArray();
}
}
public void ProcessDocument( string doc )
{
_kappa.Clear();
_omicron.Clear();
_upsilon.Clear();
var spaces = new char[]{' ','\t'};
var reOmicron = new Regex( \"^omicron\\\\b\", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | RegexOptions.Compiled );
var reKappa = new Regex( \"^kappa\\\\b\", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | RegexOptions.Compiled );
var reUpsilon = new Regex( \"^upsilon\\\\b\", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | RegexOptions.Compiled );
using (System.IO.StringReader sr = new System.IO.StringReader(doc))
{
int index = 0;
string line;
while ((line = sr.ReadLine()) != null)
{
var lline = line.TrimStart( spaces );
if ( reOmicron.IsMatch( lline ) )
{
_omicron.Add(line);
}
else
if ( reKappa.IsMatch( lline ) )
{
_kappa.Add( index );
}
else
if ( reUpsilon.IsMatch( lline ) )
{
_upsilon.Add( line );
}
index++;
}
}
}
}"
csharpProvider = dotnetobject "Microsoft.CSharp.CSharpCodeProvider"
compilerParams = dotnetobject "System.CodeDom.Compiler.CompilerParameters"
compilerParams.ReferencedAssemblies.Add("System.dll");
compilerParams.ReferencedAssemblies.Add("System.Windows.Forms.dll");
compilerParams.GenerateInMemory = on
compilerResults = csharpProvider.CompileAssemblyFromSource compilerParams #(source)
if (compilerResults.Errors.Count > 0 ) then
(
local errs = stringstream ""
for i = 0 to (compilerResults.Errors.Count-1) do
(
local err = compilerResults.Errors.Item[i]
format "Error:% Line:% Column:% %\n" err.ErrorNumber err.Line err.Column err.ErrorText to:errs
)
format "%\n" errs
undefined
)
else
(
compilerResults.CompiledAssembly
)
)
re = python.import "re"
cmd = python.import "__builtin__"
gc()
newLineStr = "\n"
newTabStr = "\t "
--( generate string
space = " "
nl = "\n"
tab = "\t"
tabNl = "\t\n"
fmt = "%\n"
wordsArr = #("alpha", "beta", "gama", "delta", "Epsilon", "Zeta", "Eta", "Theta", "Iota", "kaPPa", "LamBda", "mU", "Nu", "xi", "omicron", "pi", "rHo", "siGma", "Tau", "UpSiLoN", "pHi", "chi", "psi", "omega")
ss = stringStream ""
seed 12345
for i = 1 to 50000 do
(
wordsCnt = random 10 20
wArr = for j = 1 to wordsCnt collect (wordsArr[random 1 24])
tabCnt = random 0 5
str = ""
if mod i 17 == 0 then
str = tabNl
else
(
if mod i 33 == 0 then
str = nl
else
(
for t = 1 to tabCnt do str += tab
for w in wArr do str += space + w
)
)
format fmt str to:ss
)
--)
strToCheck = toLower (ss as string)
gc()
t0 = timestamp()
h0 = heapfree
dp = (dotNetClass "System.Activator").CreateInstance (dp_assembly.GetType("DocProcessor"))
dp.ProcessDocument strToCheck
t1 = timestamp()
format "C# % Heap: % \n" ((t1-t0)/1000.0) (h0 - heapfree)
format "kappaArr: %\n" dp.kappa.count
format "omicronArr: %\n" dp.omicron.count
format "upsilonArr: %\n" dp.upsilon.count
format "dp.kappa[1]: %\n" dp.kappa[1]
gc()
t0 = timestamp()
h0 = heapfree
n = (re.finditer "\n" strToCheck)
k = (re.finditer "kappa" strToCheck)
o = (re.finditer "omicron" strToCheck)
u = (re.finditer "upsilon" strToCheck)
kappaArr = cmd.list k as array
omicronArr = cmd.list o as array
upsilonArr = cmd.list u as array
t1 = timestamp()
format "Python % Heap: % \n" ((t1-t0)/1000.0) (h0 - heapfree)
format "kappaArr: %\n" kappaArr.count
format "omicronArr: %\n" omicronArr.count
format "upsilonArr: %\n" upsilonArr.count
format "kappaArr[1]: %\n" kappaArr[1]
)
The times are:
C# time: 0.055 Heap: 852L
kappaArr: 1896
omicronArr: 1876
upsilonArr: 1900
dp.kappa[1]: 11
Python 0.183 Heap: 6498652L
kappaArr: 28094
omicronArr: 28527
upsilonArr: 28390
kappaArr[1]: <_sre.SRE_Match object at 0x0000023B5728E8B8>
OK
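A note on the numbers above: the Python counts are much higher most likely because the plain "kappa" pattern matches the word anywhere in the text, while the C# code only counts lines that start with it. A minimal sketch of a line-start-only search in Python is given below; the helper name `line_start_hits` and the sample text are assumptions made for the example. It also returns plain integer offsets via `m.start(1)` instead of match objects.

```python
import re

def line_start_hits(text, word):
    """Start offsets of `word` where it is the first word on a line.

    Hypothetical helper for illustration only.
    """
    # re.MULTILINE makes ^ match at every line start; [ \t]* skips the indent.
    pattern = re.compile(r"^[ \t]*(" + re.escape(word) + r")\b", re.MULTILINE)
    return [m.start(1) for m in pattern.finditer(text)]

sample = "\t kappa alpha\n beta kappa\nkappa\n  kappas omega\n"
print(len(re.findall("kappa", sample)))   # every occurrence, anywhere
print(line_start_hits(sample, "kappa"))   # only first-word-on-line matches
```

With a pattern like this, the Python counts should be comparable to the C# ones, since both then apply the same "first word on the line" rule.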
Denis, I hope I can use my Python code the same way you use Python in your example.
I have posted a thread on the Autodesk forum asking almost the same thing: how Python code can be executed inside MAXScript. Maybe your approach will let me convert my code into something usable. The main goal is to learn something new.
The C# version is pretty fast.