torch.stack(tensors, dim=0)
stacks the tensors along a new dimension dim; all tensors must have the same shape
# usage
# data has to be a tensor; every slice must have the same length
torch.stack([data[i:i+some_number] for i in range(10)])
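A concrete runnable sketch (the tensor and window length below are made-up values for illustration):
import torch
data = torch.arange(20)                                   # placeholder 1-D tensor
windows = torch.stack([data[i:i+5] for i in range(10)])   # ten length-5 slices, all same shape
windows.shape
# outputs: torch.Size([10, 5])  -- stack added the new dim 0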
torch.from_numpy(numpy_array)
creates a tensor that shares memory with numpy_array; changes to one are visible in the other
import numpy as np
import torch

a = np.array([1, 2, 3])
b = torch.tensor(a)      # creates a copy of the data
c = torch.from_numpy(a)  # shares memory with a
a[0] = 11
c
# outputs: tensor([11, 2, 3])
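For contrast, the copy made by torch.tensor is unaffected by the in-place change, and .numpy() shares memory in the other direction (continuing the snippet above):
b
# outputs: tensor([1, 2, 3])    -- the copy kept the old value
c.numpy()
# outputs: array([11,  2,  3])  -- .numpy() on a CPU tensor shares memory with the tensor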
torch.flatten(input, start_dim=0, end_dim=-1)
flattens input from dimension start_dim to end_dim (the whole tensor by default)
t = torch.tensor([[[1, 2],
                   [3, 4]],
                  [[5, 6],
                   [7, 8]]])
torch.flatten(t)
# outputs: tensor([1, 2, 3, 4, 5, 6, 7, 8])
torch.flatten(t, start_dim=1)  # (2,2,2) --> (2,2*2)
# outputs: tensor([[1, 2, 3, 4],
#                  [5, 6, 7, 8]])
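A common use is flattening only the trailing dims while keeping the batch dim, e.g. before a linear layer (shapes below are made up):
x = torch.randn(4, 3, 8, 8)                      # hypothetical batch of 4 feature maps
torch.flatten(x, start_dim=1).shape              # (4, 3, 8, 8) --> (4, 3*8*8)
# outputs: torch.Size([4, 192])
torch.flatten(x, start_dim=1, end_dim=2).shape   # only dims 1..2 are collapsed
# outputs: torch.Size([4, 24, 8])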
torch.stack((tensors), dim) vs. torch.cat((tensors), dim)
torch.stack stacks tensors along a new dim, whereas torch.cat concatenates them along an existing dim.
example:
a = torch.randn(2,5,8,32)
b = torch.randn(2,1,8,32)
torch.cat((a,b), dim=1).shape
#outputs : torch.Size([2, 6, 8, 32])
a = torch.randn(3,5,8,32)
b = torch.randn(3,5,8,32)
torch.stack((a,b), dim=1).shape
#outputs: torch.Size([3, 2, 5, 8, 32])
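The two are related: stacking is the same as unsqueezing a new dim in each tensor and then concatenating along it (a quick sanity check with the a, b above):
torch.equal(torch.stack((a, b), dim=1),
            torch.cat((a.unsqueeze(1), b.unsqueeze(1)), dim=1))
# outputs: True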
For the past two years I have been training and experimenting with machine learning systems, mostly using third-party packages such as sklearn, huggingface, and so on. Sometimes the experiments become too specific, and the abstractions these packages provide become a bottleneck for performance optimization. My research goal is to understand these bottlenecks in depth and to write my own optimized, hardware-specific code that enables resource-efficient training and inference.