If implementing it the Bitcoin way, one variation I would consider is a more explicit marker. For backward-compatibility purposes, the “marker” that tells the script engine to use P2SH logic is the script pattern “HASH160 <script_hash> EQUAL”. We could make this more explicit with an opcode instead, e.g. “OP_EXEC_SCRIPT_IF_HASH <script_hash>” (the name of the opcode is open to debate; I was just being explicit about what it’s supposed to do).
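To make the difference between the two kinds of marker concrete, here is a rough sketch rather than anything taken from either codebase. It assumes scripts are plain Python lists of opcode names and byte strings, and that the implicit pattern has the BIP 16 shape (OP_HASH160, a 20-byte hash, OP_EQUAL). The function names are made up for illustration, OP_EXEC_SCRIPT_IF_HASH is only the placeholder name suggested above, and the hash is a truncated SHA-256 used purely as dummy data.

```python
import hashlib

OP_HASH160 = "OP_HASH160"
OP_EQUAL = "OP_EQUAL"
# Hypothetical opcode; only the placeholder name proposed in the discussion.
OP_EXEC_SCRIPT_IF_HASH = "OP_EXEC_SCRIPT_IF_HASH"


def is_p2sh_by_pattern(script):
    """Implicit marker: the whole script must have one exact three-item shape."""
    return (
        len(script) == 3
        and script[0] == OP_HASH160
        and isinstance(script[1], bytes)
        and len(script[1]) == 20
        and script[2] == OP_EQUAL
    )


def is_p2sh_by_opcode(script):
    """Explicit marker: a dedicated opcode followed by the script hash."""
    return (
        len(script) == 2
        and script[0] == OP_EXEC_SCRIPT_IF_HASH
        and isinstance(script[1], bytes)
    )


# Dummy 20-byte value standing in for a real script hash (assumption).
script_hash = hashlib.sha256(b"example lock script").digest()[:20]

implicit = [OP_HASH160, script_hash, OP_EQUAL]
explicit = [OP_EXEC_SCRIPT_IF_HASH, script_hash]
```

With the implicit form the engine has to recognise a shape that is also a perfectly ordinary script; with the explicit form the intent is carried by the opcode itself.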
If we do go the Bitcoin route, that sort of output script pattern matching isn’t (currently) necessary at all (or even possible), because transaction outputs do not contain any script. The only type of transaction which is currently functional is P2PKH, in which case the TxOut contains only the public key hash, and the redeeming TxIn contains the full P2PKH script (not only the signature and public key, but also the public key hash and all the verification opcodes). As described in my proposal for improved transaction serialization, my current plan is to modify TxOuts to use a sort of enum format, so that rather than matching specific script patterns in the TxOut to determine how to interpret it, it’d simply say “This TxOut is P2PKH and the public key has this hash” or “This TxOut is P2SH and the lock script has this hash”. We could do the same for TxIns, so that P2PKH TxIns would just contain a public key and signature, and P2SH TxIns would contain <whatever we decide should go in there>.
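As a rough illustration of the enum idea (not the actual serialization proposal), the sketch below models the two TxOut kinds as separate tagged types, so the interpreter dispatches on the variant instead of matching a script pattern. The class names, field names and the describe helper are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class P2PKHOut:
    """Hypothetical variant: 'this TxOut is P2PKH and the public key has this hash'."""
    pubkey_hash: bytes


@dataclass(frozen=True)
class P2SHOut:
    """Hypothetical variant: 'this TxOut is P2SH and the lock script has this hash'."""
    script_hash: bytes


# The "enum": a TxOut lock is exactly one of the variants above.
TxOutLock = Union[P2PKHOut, P2SHOut]


def describe(lock: TxOutLock) -> str:
    """Dispatch on the variant; no script pattern matching is involved."""
    if isinstance(lock, P2PKHOut):
        return f"P2PKH, public key hash {lock.pubkey_hash.hex()}"
    if isinstance(lock, P2SHOut):
        return f"P2SH, lock script hash {lock.script_hash.hex()}"
    raise TypeError(f"unknown TxOut variant: {type(lock).__name__}")
```

Adding a new kind of output then means adding a variant and a branch, and an unrecognised variant fails loudly instead of being misread as a script.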
I see why having a way to put scripts on the stack would be useful in some cases, but since (as described in my previous paragraph) we do have the flexibility here to add arbitrary new fields, I don’t see the advantage for the P2SH case vs just adding a second script field. To me it just seems like unnecessary additional complexity, because if we take that approach the only thing that we’ll ever end up doing with the serialized lock script is extracting it from the unlock script and deserializing it. Is this (enum-ifying TxIns/Outs) a bad idea? Should I instead focus on changing everything to use the more Bitcoin-like approach of a single script per TxOut and a single script per TxIn?
The tx_in_signable_string is necessary because our OP_CHECKSIG is different from the one in Bitcoin. Bitcoin’s OP_CHECKSIG has two operands (signature, pubkey), and the message which is verified is implicitly generated from the transaction inputs and outputs. Our OP_CHECKSIG has three operands (message, signature, pubkey), so if the script wants to verify that the signature is valid for this specific transaction, it needs to be able to access some kind of data which is unique to a particular transaction. This isn’t unique to P2SH; the current implementation of P2PKH already does this (all P2PKH scripts start off by pushing the tx_in_signable_string onto the stack, and the transaction is considered invalid if the string doesn’t match).
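Here is a minimal sketch of how the three-operand form ties a signature to one transaction. It is not the real interpreter: toy_verify is a stand-in built from a hash and is not a real signature scheme, run_unlock is a made-up helper, and the push order (message first, pubkey on top) is an assumption.

```python
import hashlib
import hmac


def toy_verify(message: bytes, signature: bytes, pubkey: bytes) -> bool:
    """Stand-in for a real signature check; NOT cryptographically meaningful."""
    expected = hashlib.sha256(pubkey + message).digest()
    return hmac.compare_digest(signature, expected)


def op_checksig(stack: list) -> None:
    """Three-operand form: pops pubkey, signature, message; pushes the result."""
    pubkey = stack.pop()
    signature = stack.pop()
    message = stack.pop()
    stack.append(toy_verify(message, signature, pubkey))


def run_unlock(first_push: bytes, signature: bytes, pubkey: bytes,
               tx_in_signable_string: bytes) -> bool:
    """Hypothetical helper mirroring the check described above."""
    # The script's first push must be this transaction's signable string;
    # otherwise the transaction is rejected before any signature check.
    if first_push != tx_in_signable_string:
        return False
    stack = [first_push, signature, pubkey]
    op_checksig(stack)
    return stack.pop()


pubkey = b"example-pubkey"
tx_a = b"signable string of tx A"
sig_a = hashlib.sha256(pubkey + tx_a).digest()  # toy "signature" over tx A
```

Because the message is an explicit operand, a signature made for one transaction only verifies when that same transaction's signable string is what the script pushes; replaying it against another transaction fails either the string comparison or the signature check.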