2017-02-10

The PowerShell version of the "grep" command?

I needed to find text inside files in a folder and its subfolders, and make a list of the matching files in which each file is listed only once. This is what I came up with.

First we get all .txt files on the C: drive. -File restricts the result to files (no folders), *.txt is bound by position to -Filter and decides which files are returned, -Recurse makes it search all subfolders, -Force includes hidden files and -ErrorAction SilentlyContinue makes it skip folders it cannot read.
Get-ChildItem C:\ -File *.txt -Recurse -Force -ErrorAction SilentlyContinue

Now we want to search inside these files. For this we pipe them to Select-String:
Select-String -Pattern "Microsoft" -ErrorAction SilentlyContinue
This lists a file once for each line that contains "Microsoft". We don't care about the matched text; all we want is the file name and path. -ExpandProperty returns the path as a plain string, so it is shown in full even when it is very long. If you only use -Property you get an object whose Path column may be cut off in the console output:
Select-Object -ExpandProperty Path

We only want to see each file once:
Group-Object

Group-Object returns one group per distinct path, and the Name property of each group holds that path, so we keep only the name:
Select-Object -ExpandProperty Name

If we put everything together it becomes:
Get-ChildItem C:\ -Recurse -File *.txt -ErrorAction SilentlyContinue | Select-String -Pattern "Microsoft" -ErrorAction SilentlyContinue | Select-Object -ExpandProperty Path | Group-Object | Select-Object -ExpandProperty Name

You can replace the search path, the search text and the file filter with variables:
$SearchPath="C:\"
$SearchText="Anothertext"
$SearchFiles="*.txt"
Get-ChildItem $SearchPath -Recurse -File $SearchFiles -ErrorAction SilentlyContinue | Select-String -Pattern "$SearchText" -ErrorAction SilentlyContinue | Select-Object -ExpandProperty Path | Group-Object | Select-Object -ExpandProperty Name
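
One thing to keep in mind when the search text comes from a variable: Select-String treats -Pattern as a regular expression by default, so characters like "." have a special meaning. The -SimpleMatch switch makes it search for the literal text instead. Below is a minimal sketch of the difference; the demo folder and the file names a.txt and b.txt are made up for this example only.

```powershell
# Hypothetical demo folder and files, created only for this example
$DemoPath = Join-Path ([System.IO.Path]::GetTempPath()) ("grepdemo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $DemoPath | Out-Null
Set-Content -Path (Join-Path $DemoPath 'a.txt') -Value 'version 1.0 released'
Set-Content -Path (Join-Path $DemoPath 'b.txt') -Value 'version 1x0 released'

$SearchText = '1.0'

# Regex match: the dot matches any character, so both files are found
$RegexHits = Get-ChildItem $DemoPath -Recurse -File -Filter *.txt |
    Select-String -Pattern $SearchText |
    Select-Object -ExpandProperty Path

# Literal match: only a.txt contains the exact text "1.0"
$LiteralHits = Get-ChildItem $DemoPath -Recurse -File -Filter *.txt |
    Select-String -Pattern $SearchText -SimpleMatch |
    Select-Object -ExpandProperty Path

$RegexHits
$LiteralHits

# Clean up the demo folder
Remove-Item -Path $DemoPath -Recurse -Force
```

If the text you search for is plain text and not a pattern, adding -SimpleMatch to the commands above is probably the safer choice.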

If I wanted to do this the old way, as a CMD batch script, it would look like this:
@ECHO OFF
FOR /F "delims=" %%A IN ('DIR /S /B "C:\*.txt"') DO (
   FINDSTR /M /I "Microsoft" "%%A"
)


I wanted to see if PowerShell is faster than an old-school CMD script, so I used Measure-Command:
Measure-Command { put the command in here }

It took 11.95 seconds with Powershell and 32.52 seconds for CMD!

Please note that I ran the commands several times to make sure Windows had time to cache the information. I started with PowerShell and then ran CMD.
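
For reference, here is a small sketch of how such a timing can be captured. Measure-Command runs the script block, discards its output and returns a TimeSpan object. The demo folder and the variable name $Elapsed are assumptions made for this example; point $SearchPath at C:\ to repeat the real test.

```powershell
# Hypothetical demo folder; replace $SearchPath with C:\ for the real test
$SearchPath = Join-Path ([System.IO.Path]::GetTempPath()) ("grepdemo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $SearchPath | Out-Null
Set-Content -Path (Join-Path $SearchPath 'a.txt') -Value 'Microsoft'

# Measure-Command runs the script block and returns a TimeSpan
$Elapsed = Measure-Command {
    Get-ChildItem $SearchPath -Recurse -File -Filter *.txt -ErrorAction SilentlyContinue |
        Select-String -Pattern "Microsoft" -ErrorAction SilentlyContinue |
        Select-Object -ExpandProperty Path |
        Group-Object |
        Select-Object -ExpandProperty Name
}

# Report the elapsed time in seconds with two decimals
"{0:N2} seconds" -f $Elapsed.TotalSeconds

# Clean up the demo folder
Remove-Item -Path $SearchPath -Recurse -Force
```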

Can it be done with fewer pipes? I don't know, you tell me how!
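
One possible answer, as a sketch and not a measured result: Select-String has a -List switch that returns only the first match from each file, so every file shows up once and the Group-Object step is not needed. The demo folder and file names below are made up for this example.

```powershell
# Hypothetical demo folder and files, created only for this example
$DemoPath = Join-Path ([System.IO.Path]::GetTempPath()) ("grepdemo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $DemoPath | Out-Null
Set-Content -Path (Join-Path $DemoPath 'a.txt') -Value 'Microsoft one', 'Microsoft two'
Set-Content -Path (Join-Path $DemoPath 'b.txt') -Value 'nothing to see here'

# -List stops after the first match in each file, so each path is returned once
$Files = Get-ChildItem $DemoPath -Recurse -File -Filter *.txt -ErrorAction SilentlyContinue |
    Select-String -Pattern "Microsoft" -List -ErrorAction SilentlyContinue |
    Select-Object -ExpandProperty Path

$Files

# Clean up the demo folder
Remove-Item -Path $DemoPath -Recurse -Force
```

This only works when you do not need the number of matches per file. To filter on the count you still need Group-Object.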

To find files where "Microsoft" is found on more than one line (you can change "1" to whatever you like; Select-String reports one match per matching line):
Get-ChildItem C:\ -Force -Recurse -File *.txt -ErrorAction SilentlyContinue | Select-String -Pattern "Microsoft" -ErrorAction SilentlyContinue | Select-Object -ExpandProperty Path | Group-Object | Where-Object {$_.count -gt 1} | Select-Object -ExpandProperty Name

If we want to use this command in our daily work it would be nice to make a PowerShell module.
It could look something like this:

#=== FILE Get-FilesFindText.psm1 BEGINS ===
Function Get-FilesFindText {
    <#
    .SYNOPSIS
    Search for files that include the specified text
    .DESCRIPTION
    Windows Powershell version of Grep
    It does what you can do with:
    @ECHO OFF
    FOR /F "delims=" %%A IN ('DIR /S /B "C:\*.txt"') DO (
        FINDSTR /M /I "Microsoft" "%%A"
    )
    But in about 1/3 of the time!

    Copy the files Get-FilesFindText.psd1 and Get-FilesFindText.psm1 into a new folder called Get-FilesFindText
    Copy the new folder into "C:\Program Files\WindowsPowerShell\Modules"
    or another folder in your $ENV:PSModulePath and run: Import-Module Get-FilesFindText
    Verify by running "Get-Module". Get-FilesFindText should be listed.
    .NOTES
    File name: Get-FilesFindText
    Author: David Djerf
    Blog: http://djerfhowididit.blogspot.se/
    .PARAMETER SearchPath
    What path to search, for example: "c:\Program Files"
    .PARAMETER SearchFiles
    What file type to search for, for example: "*.txt"
    .PARAMETER SearchText
    What text are you searching for, for example: "Microsoft"
    .PARAMETER ListOnlyIfMoreThan
    List only files that occur in the result more than x times; must be an integer.
    .EXAMPLE
    Get-FilesFindText -SearchPath "C:\Program Files" -SearchText "Microsoft" -SearchFiles *.txt -ListOnlyIfMoreThan 1
    #>    
    param(
        [Parameter(Mandatory=$True,
                   HelpMessage='Specify a search path: [c:\Windows]')]
        [string]$SearchPath,
        [Parameter(Mandatory=$True,
                   HelpMessage='Specify text to search for [findme]')]
        [string]$SearchText,
        [Parameter(Mandatory=$True,
                   HelpMessage='Specify files to search for [*.txt]')]
        [string]$SearchFiles,
        [int]$ListOnlyIfMoreThan
    )
    BEGIN {
        IF (!$ListOnlyIfMoreThan) { $ListOnlyIfMoreThan=0 }
        $SearchPath=Resolve-Path $SearchPath # Get a proper path
                
    } # End of BEGIN
    PROCESS {
        IF ($SearchPath) {
            # Collect the distinct paths of files whose match count exceeds the threshold
            $files = Get-ChildItem $SearchPath -Recurse -File $SearchFiles -ErrorAction SilentlyContinue |
                Select-String -Pattern "$SearchText" -ErrorAction SilentlyContinue |
                Select-Object -ExpandProperty Path |
                Group-Object |
                Where-Object { $_.Count -gt $ListOnlyIfMoreThan } |
                Select-Object -ExpandProperty Name
            foreach ($file in $files) {
                $properties = [ordered]@{ 'FilePath' = Resolve-Path $file } # End of properties
                # Make objects
                $output = New-Object -TypeName PSObject -Property $properties
                Write-Output $output
            } # End of foreach
        } # End of IF
    } # End of Process
} # End of Function
#=== FILE Get-FilesFindText.psm1 ENDS ===

#=== FILE Get-FilesFindText.psd1 BEGINS ===

@{
    RootModule = 'Get-FilesFindText.psm1'
    ModuleVersion = '2017.2.10.2207'
    GUID = 'e6e80f2b-aaa4-41e0-a4a7-475c9fbbae69'
    Author = 'David Djerf'
    Description = 'Find text and list the file paths. Windows version of linux grep'
}
#=== FILE Get-FilesFindText.psd1 ENDS ===