1

I have the following data (actual output):

http://localhost:5058/uaa/token,80
https://t-mobile.com,443
http://USERSECURITYTOKEN/payments/security/jwttoken,80
https://core.op.api.internal.t-mobile.com/v1/oauth2/accesstoken?grant_type,443
http://AUTOPAYV3/payments/v3/autopay/search,80
http://AUTOPAYV3/payments/v3/autopay,80
http://CARDTYPEVALIDATION/payments/v4/internal/card-type-validation/getBinDetails,80

I am trying to get the following data (expected output):

localhost:5058/uaa/token,80
t-mobile.com,443
USERSECURITYTOKEN/payments/security/jwttoken,80
core.op.api.internal.t-mobile.com/v1/oauth2/accesstoken?grant_type,443
AUTOPAYV3/payments/v3/autopay/search,80
AUTOPAYV3/payments/v3/autopay,80
CARDTYPEVALIDATION/payments/v4/internal/card-type-validation/getBinDetails,80

and would like to combine a working command with the script below:

#!/bin/bash
for file in $(ls); 
do 
#echo  " --$file -- "; 
grep -P  '((?<=[^0-9.]|^)[1-9][0-9]{0,2}(\.([0-9]{0,3})){3}(?=[^0-9.]|$)|(http|ftp|https|ftps|sftp)://([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:/+#-]*[\w@?^=%&/+#-])?|\.port|\.host|contact-points|\.uri|\.endpoint)' $file|grep '^[^#]' |awk '{split($0,a,"#"); print a[1]}'|awk '{split($0,a,"="); print a[1],a[2]}'|sed 's/^\|#/,/g'|awk '/http:\/\//  {print $2,80}
       /https:\/\// {print $2,443}
       /Points/     {print $2,"9042"}
       /host/       {h=$2}
       /port/       {print h,$2; h=""}'|awk -F'[, ]' '{for(i=1;i<NF;i++){print $i,$NF}}'|awk 'BEGIN{OFS=","} {$1=$1} 1'|sed '/^[0-9]*$/d'|awk -F, '$1 != $2' 
done |awk '!a[$0]++' 
#echo "Done."
stty echo
cd ..

Need the solution ASAP, thank you in advance

  • Everything above stty echo could/should just be 1 awk script. Post a new question if you'd like help with writing that. Any time you find yourself writing chains of pipes involving multiple greps, seds, awks, etc. there is always a better approach.
    – Ed Morton
    May 6, 2020 at 20:46

2 Answers

2

@DopeGhoti has already posted an excellent answer.

While the example data in the question contains only "http://" and "https://" URIs, the grep pattern in the poster's script also lists ftp, ftps, and sftp, which suggests they expect to handle those schemes as well.

So here's a generalized answer that removes any scheme (along with any leading whitespace) from the start of the URI:

sed -E 's/^\s*.*:\/\///g'
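
A narrower variant, offered as a sketch rather than a drop-in replacement: the `.*` in the command above is greedy, so on a line that contains a second `://` (for example inside a query string) it strips everything up to the last one. The version below matches only a leading scheme name followed by `://`. The function name `strip_scheme` and the sample lines are made up for illustration.

```shell
#!/bin/bash
# strip_scheme: hypothetical helper. Removes optional leading whitespace
# and one URI scheme (a letter, then letters/digits/+/./-) plus "://".
strip_scheme() {
    sed -E 's#^[[:space:]]*[[:alpha:]][[:alnum:]+.-]*://##'
}

# Illustrative input, not taken from the question's data.
printf '%s\n' \
    '  sftp://files.example.com/drop,22' \
    'https://example.com/cb?next=http://other.example.com,443' |
    strip_scheme
```

The second sample line keeps its embedded `http://` because the bracket expression cannot match past the first `:`.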

and here's a link with some sample input for experimentation:

Try it online!

  • @KalpanaPinninty, I hope you get it working well. As a side note, it's generally a good idea to wait some time (maybe 24 hours or so) before accepting an answer, as it might discourage others from providing a better answer.
    – spuck
    May 6, 2020 at 23:29
  • Thank you @spuck, I will make a note of this :)
    May 10, 2020 at 19:51
2

Given the data in a file called input, with sed:

$ sed -E 's_^https?://__' input
localhost:5058/uaa/token,80
t-mobile.com,443
USERSECURITYTOKEN/payments/security/jwttoken,80
core.op.api.internal.t-mobile.com/v1/oauth2/accesstoken?grant_type,443
AUTOPAYV3/payments/v3/autopay/search,80
AUTOPAYV3/payments/v3/autopay,80
CARDTYPEVALIDATION/payments/v4/internal/card-type-validation/getBinDetails,80
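
To fold this into the script from the question, one option (a sketch, not tested against the real configuration files) is to append the substitution as the final pipeline stage, after the de-duplication step that follows `done`. The pattern is widened here to the other schemes named in the question's grep expression. `strip_known_schemes` is a made-up helper name, and the `printf` stands in for the output of the existing loop.

```shell
#!/bin/bash
# strip_known_schemes: hypothetical helper. Drops a leading
# http, https, ftp, ftps or sftp scheme from each input line.
strip_known_schemes() {
    sed -E 's_^(https?|ftps?|sftp)://__'
}

# Stand-in for the output of the existing "for file in ...; do ...; done" loop.
printf '%s\n' \
    'http://localhost:5058/uaa/token,80' \
    'https://t-mobile.com,443' \
    'http://localhost:5058/uaa/token,80' |
    awk '!seen[$0]++' |    # de-duplicate, as the original script does
    strip_known_schemes
```

In the real script this would amount to ending the loop with `done | awk '!a[$0]++' | sed -E 's_^(https?|ftps?|sftp)://__'`.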

Also, regarding

for file in $(ls); 

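Don't parse the output of ls, you will be sad: filenames containing whitespace or glob characters get split or expanded and break the loop. Instead, use a glob: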

for file in *;
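
A fuller sketch of that loop, assuming the files to scan sit directly in one directory: the glob replaces `ls`, `"$file"` is quoted so names with spaces survive, and anything that is not a regular file is skipped. The function name `list_regular_files` is hypothetical, and the `printf` is only a placeholder for the pipeline in the question.

```shell
#!/bin/bash
# list_regular_files: hypothetical demo. Prints the name of each regular
# file in a directory, iterating with a glob instead of $(ls).
list_regular_files() {
    local dir=$1 file
    for file in "$dir"/*; do
        [ -f "$file" ] || continue     # skips directories and an unmatched glob
        printf '%s\n' "${file##*/}"    # placeholder for: grep ... "$file" | ...
    done
}

list_regular_files .
```

The `[ -f "$file" ]` test also covers an empty directory, where the unmatched pattern would otherwise be passed through literally.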
  • Thank you for the reply, working on it.
    May 6, 2020 at 20:31
