Parallel data processing in PowerShell is not trivial, especially with
queueing. Try existing tools that already handle this. Take a look at the
module SplitPipeline. Its cmdlet Split-Pipeline is designed for parallel
processing of input data and supports queueing of input (see the parameter
Load). For example, for 4 parallel pipelines, each taking 10 input items at a
time, the code looks like this:
$csv | Split-Pipeline -Count 4 -Load 10, 10 {process{
<operate on input item $_>
}} | Out-File $outputReport
All you have to do is implement the code <operate on input item $_>.
Parallel processing and queueing are done by this command.
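To make the template above concrete, here is a minimal sketch that can be pasted into a console. It assumes the SplitPipeline module is already installed (for instance from the PowerShell Gallery), and it uses plain numbers as hypothetical input instead of CSV rows; the squaring step is only a placeholder for the real work.

```powershell
# Assumption: the SplitPipeline module is installed and can be imported.
Import-Module SplitPipeline

# Hypothetical input: 100 numbers stand in for the CSV rows.
$result = 1..100 | Split-Pipeline -Count 4 -Load 10, 10 {process{
    # "operate on input item $_": here the item is simply squared
    $_ * $_
}}

# Results from parallel pipelines may arrive in a different order than
# the input, so sort them before looking at them.
$result | Sort-Object | Select-Object -First 3
```

The script block uses process{...} so that each item received by a pipeline is handled one at a time via $_, the same way as in the template.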
UPDATE for the updated question code. Here is the prototype code with some
remarks. They are important: doing work in parallel is not the same as
running the same code directly in one pipeline, and there are some rules to
follow.
$csv | Split-Pipeline -Count 4 -Load 10, 10 -Variable findSize {process{
# Tips
# - Operate on the input object $_, i.e. $_.PCname and $_.User
# - Use the imported variable $findSize
# - Do not use Write-Host, use (for now) Write-Warning
# - Do not count issues (for now). This is possible, but make it work
# without this first.
# - Do not write data to a file. From several parallel pipelines this
# is not trivial. Just output the data, it will be piped further to
# the log file
...
}} | Set-Content $report
# output from all jobs is joined and written to the report file
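As an illustration of these rules, here is a small sketch with made-up data. Everything in it apart from Split-Pipeline and its parameters is hypothetical: the rows, the meaning of $findSize (treated here as a simple length threshold) and the temporary report path are stand-ins, since the real ones come from the question's script. It assumes the SplitPipeline module is installed.

```powershell
# Assumption: the SplitPipeline module is installed and can be imported.
Import-Module SplitPipeline

# Hypothetical stand-ins for the question's variables
$findSize = 3
$csv = @(
    [pscustomobject]@{ PCname = 'PC01'; User = 'ann' }
    [pscustomobject]@{ PCname = 'PC02'; User = 'robert' }
    [pscustomobject]@{ PCname = 'PC03'; User = 'eve' }
    [pscustomobject]@{ PCname = 'PC04'; User = 'melissa' }
)
$report = Join-Path ([System.IO.Path]::GetTempPath()) 'split-pipeline-report.txt'

$csv | Split-Pipeline -Count 2 -Variable findSize {process{
    # $findSize is visible here only because it was imported with -Variable
    if ($_.User.Length -gt $findSize) {
        # diagnostics go to the warning stream, not to Write-Host
        Write-Warning "Long user name on $($_.PCname)"
    }
    # just output data; the single Set-Content outside does the file writing
    "$($_.PCname);$($_.User)"
}} | Set-Content $report
```

Only the one Set-Content at the end touches the file, so the parallel pipelines never compete for it.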
UPDATE: How to write progress information
SplitPipeline handled an 800-target CSV pretty well, amazing. Is there any way
to let the user know that the script is alive...? Scanning a big CSV can take
about 20 minutes. Something like "in progress 25%", "50%", "75%"...
There are several options. The simplest is just to invoke Split-Pipeline with
the switch -Verbose. You will then get verbose messages about the progress and
see that the script is alive.
Another simple option is to write and watch verbose messages from the jobs,
e.g. Write-Verbose ... -Verbose, which will write messages even if
Split-Pipeline is invoked without -Verbose.
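The two verbose options can be sketched side by side like this. The input numbers and the message text are hypothetical, and the sketch assumes the SplitPipeline module is installed.

```powershell
# Assumption: the SplitPipeline module is installed and can be imported.
Import-Module SplitPipeline

# Option 1: let Split-Pipeline itself report what it is doing
$out1 = 1..20 | Split-Pipeline -Count 4 -Verbose {process{
    $_
}}

# Option 2: write verbose messages from the jobs; the explicit -Verbose on
# Write-Verbose makes them show even though Split-Pipeline has no -Verbose
$out2 = 1..20 | Split-Pipeline -Count 4 {process{
    Write-Verbose "Processing item $_" -Verbose
    $_
}}
```

With many input items, option 2 may be too chatty; writing a message only for every Nth item is one possible compromise.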
Yet another option is to use proper progress messages with Write-Progress.
See the script Test-ProgressTotal.ps1, which also shows how to use a collector
that is updated from jobs concurrently. You can use a similar technique for
counting issues (the original question code does this). When all is done, show
the total number of issues to the user.
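One possible way to count issues with a shared collector is sketched below. This is not the code from Test-ProgressTotal.ps1. It swaps in a thread-safe .NET ConcurrentBag as the collector, and it assumes that a variable imported with -Variable refers to the same object in every parallel pipeline rather than to a copy. The input numbers and the "issue" rule (multiples of 4) are made up, and the SplitPipeline module is assumed to be installed.

```powershell
# Assumption: the SplitPipeline module is installed and can be imported.
Import-Module SplitPipeline

# Thread-safe collector created once in the calling script
$issues = New-Object 'System.Collections.Concurrent.ConcurrentBag[string]'

$data = 1..20 | Split-Pipeline -Count 4 -Variable issues {process{
    # Hypothetical rule: every multiple of 4 counts as an issue
    if ($_ % 4 -eq 0) {
        # Assumption: $issues here is the same object as in the caller
        $issues.Add("Issue with item $_")
    }
    # normal output still goes down the pipeline
    $_
}}

# When all is done, show the total to the user
"Total issues: $($issues.Count)"
```

A concurrent collection is used because incrementing a plain counter from several pipelines at once is not safe without locking.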