Saturday, May 19, 2018

Saturday, May 05, 2018

Run kubectl against a specific cluster or context

First list the available contexts:
kubectl config get-contexts
Then either switch to a context:
kubectl config use-context $CONTEXT_NAME
Or simply run each command with the --context flag. For example, to list the pods in a specific cluster run:
kubectl --context $CONTEXT_NAME get pods
To avoid verbosity, create functions in ~/.profile:
kubetest() {
    # "$@" (quoted) keeps arguments that contain spaces intact
    kubectl --context="$TEST_CONTEXT_NAME" "$@"
}

kubeprod() {
    kubectl --context="$PROD_CONTEXT_NAME" "$@"
}
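
If you work with more than two clusters, one generic wrapper may be easier to maintain than one function per context. The following is only a sketch: kubein is a hypothetical name that does not come from kubectl, and it assumes the context name is always passed as the first argument.

```shell
# Hypothetical helper: the first argument is the context name,
# everything after it is passed to kubectl unchanged.
kubein() {
    if [ "$#" -lt 2 ]; then
        echo "usage: kubein CONTEXT_NAME KUBECTL_ARGS..." >&2
        return 1
    fi
    local context="$1"
    shift
    kubectl --context="$context" "$@"
}
```

With this in ~/.profile, kubein $CONTEXT_NAME get pods behaves like the --context example above.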

Friday, April 06, 2018

Removing text blocks containing repetition with Unix or Linux Power Tools

Let us illustrate the issue with an example. In the translation industry, a TMX file is an XML representation of a translation memory (TM), and the format is useful for exchanging TMs between tools. It contains translation units (tu nodes) with properties (prop nodes) and translation unit variants (tuv nodes), whose segments (seg nodes) hold the text in the source and target languages. Computer Aided Translation (CAT) tools often add the same segment again and again. This helps produce more precise translations, but it becomes a burden when you try to process such a big TMX with an open source CAT tool like OmegaT: since OmegaT is client side only, processing a big TMX is problematic. In that case you might accept less precise translations in exchange for being able to use the free tool. These repetitions mostly come from the context stored around a specific segment (the x-context-pre and x-context-post type attributes).

The question is then how to remove every "tu" node that contains a duplicated segment, leaving just one of them (again, we lose precision in the translation output, but it might be worth it because of the savings from using a free CAT tool).

The straightforward answer is to export the TMX from the original tool using whatever options it provides to export less data, specifically ignoring context-specific translations. If that is not a possibility, we are left with building a tool to clean it up.

First we can get an idea of which segments are duplicated and how many times each:
cat input.tmx | grep '<seg>' \
| sort | uniq -c | sort -nr \
| grep -v '^ *1 ' > tmx-repetitions.txt
Then we can replace every repeated occurrence with a marker string like DUPLICATE_NODE_PLEASE_REMOVE:
cat input.tmx \
| awk '{if($0 ~ /<seg>/ && !seen[$0]++ || $0 !~ /<seg>/) print $0; \
else print "DUPLICATE_NODE_PLEASE_REMOVE"}' > input-with-marked-duplicates.tmx
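
To see what the marking step does, here is the same seen[] idiom applied to a tiny made-up input. The function name mark_duplicate_segs and the sample lines are assumptions for illustration only. Lines without a seg node pass through untouched, the first copy of each segment is kept, and later copies become the marker.

```shell
# Hypothetical wrapper around the awk one-liner, reading stdin.
mark_duplicate_segs() {
    awk '{
        # Non-segment lines, and the first occurrence of each segment, are printed as-is.
        if ($0 !~ /<seg>/ || !seen[$0]++) print $0
        # Any later occurrence of the same segment line is replaced by the marker.
        else print "DUPLICATE_NODE_PLEASE_REMOVE"
    }'
}

printf '%s\n' '<seg>Hello</seg>' '<prop>x</prop>' '<seg>Hello</seg>' '<seg>Bye</seg>' \
| mark_duplicate_segs
# prints:
# <seg>Hello</seg>
# <prop>x</prop>
# DUPLICATE_NODE_PLEASE_REMOVE
# <seg>Bye</seg>
```

Note that the comparison is on the whole line, so two segments count as duplicates only if their lines match exactly, whitespace included.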
Finally we can try removing the whole translation unit (tu) node with perl:
cat input-with-marked-duplicates.tmx \
| perl -0pe 's#<tu\b(?:(?!</tu>).)*?DUPLICATE.*?</tu>##gs'
But if the file is big enough this won't work as expected, probably because of how perl does multiline parsing in this particular command (in memory: the -0 switch makes it read the whole input as a single record). This is the reason why I built the open source bash-multiline-replace project, which contains a simple bash script (multilineReplace.sh) that eliminates full blocks, from a start pattern to an end pattern, if they contain an inner pattern.
cat input-with-marked-duplicates.tmx \
| ./multilineReplace.sh '<tu ' 'DUPLICATE' '</tu>' 
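
The idea of dropping a block when it contains an inner pattern can also be sketched as a streaming awk filter. This is not the multilineReplace.sh script itself, just a minimal illustration under some assumptions: drop_blocks is a hypothetical name, the three arguments are treated as fixed strings rather than regular expressions, and blocks are assumed not to nest. Only one block is buffered at a time, so memory use does not grow with the file size.

```shell
# Hypothetical helper: drop_blocks START INNER END, reading stdin.
drop_blocks() {
    awk -v start="$1" -v inner="$2" -v end="$3" '
        # Outside a block, a line containing START opens a new buffer.
        !inblock && index($0, start) { inblock = 1; buf = ""; drop = 0 }
        inblock {
            buf = buf $0 "\n"
            if (index($0, inner)) drop = 1      # block is marked for removal
            if (index($0, end)) {               # block is complete
                if (!drop) printf "%s", buf
                inblock = 0
            }
            next
        }
        { print }                               # lines outside any block
        END { if (inblock) printf "%s", buf }   # keep an unterminated block
    '
}
```

It would be called the same way as the script above, for example drop_blocks '<tu ' 'DUPLICATE' '</tu>' < input-with-marked-duplicates.tmx. The trailing space in '<tu ' keeps tuv nodes from being mistaken for the start of a unit.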

Saturday, March 24, 2018

Pdf Bash Tools - Ghostscript - Watermarks, password protection, search, split, merge and beyond

There is a lot of PDF processing you can do from the command line, including searching, splitting, merging, password protection and watermarking. Yup, for free. Check out and contribute to my pdf bash tools project.

Friday, March 16, 2018

Manage HP ProCurve Switches programmatically from *nix

Just released ProCurve Commander. Repeating yourself is not fun. This is true not only when it comes to managing multiple switches but also to auditing them. The same idea can be used to manage Cisco switches and, in general, any device that is accessible via SSH but not friendly to remote command invocation.

Thursday, March 15, 2018

Hardening HP ProCurve switches

Enable SSH:
telnet 
# config
(config)# crypto key generate ssh
(config)# ip ssh
(config)# show ip ssh
(config)# exit
# exit
> exit
Confirm ssh works and disable telnet:
ssh 
# config
(config)# no telnet
(config)# exit
# exit
> exit
Change default users and set complex passwords:
password operator user-name 
password manager user-name 
Identify the switch:
# config
(config)# hostname "My ProCurve Switch  "

Wednesday, March 14, 2018

Java Applets in Mac OS X

Your only option is Safari, just as your only option is Internet Explorer for Windows. If the applet is insecure it won't run but you can always add exceptions at your own risk. From Apple System Preferences click on Java | Security tab | Edit Site List | Add | Apply | OK | Restart Safari.

Parsing CSV from bash

In one word: csvtool.

To install it in Ubuntu:
sudo apt-get install csvtool
To install it in OS X:
brew install opam
opam init
eval `opam config env`
opam install csv
csvtool --help
To extract the second column from sample.csv (csvtool numbers columns starting from 1):
cat sample.csv | csvtool col 2 -
Find more from:
csvtool --help
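
To see why a CSV-aware tool is worth installing, compare it with a plain split on commas. The sketch below uses only cut, and the function name naive_col and the sample rows are made up for illustration. A quoted field that contains a comma is cut in half, which is exactly the case a real CSV parser handles.

```shell
# Hypothetical helper: naive column extraction that ignores CSV quoting.
naive_col() {
    cut -d, -f"$1"
}

printf '%s\n' 'name,note' '"Doe, Jane",ok' | naive_col 2
# prints:
# note
#  Jane"
```

The second row should have yielded ok, but the comma inside the quoted name was treated as a separator.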
